Abstract. We consider a new kind of interpretation over relational structures: finite sets
interpretations. These interpretations are defined by weak monadic second-order (WMSO)
formulas with free set variables. They transform a given structure into a structure with
a domain consisting of finite sets of elements of the original structure. The definition of
these interpretations directly implies that they send structures with a decidable WMSO 
theory to structures with a decidable first-order theory. In this paper, we investigate the 
expressive power of such interpretations applied to infinite deterministic trees. The results 
can be used in the study of automatic and tree-automatic structures. 



1. Introduction 

Computational model theory is concerned with the study of algorithmic properties of 
classes of infinite structures (cf. [BG04]), where the focus is on the problem of model
checking such structures against specifications written in some logic, i.e., deciding for a 
given structure and logical formula if the formula holds in this structure. This problem 
setting has been studied for various instantiations of the two parameters, i.e., the way to 
represent the structures, and the logic to write the specifications. The most prominent 
logics in this context are first-order (FO) logic and monadic second-order (MSO) logic, and 
they have led to two tracks of research trying to identify classes of structures for which the 
respective logic is decidable. 

One way of defining such classes uses, e.g., words or trees for representing the elements of 
the structure and uses simple transformations based on transducers or rewriting to define 
the relations of the structure. In this way, one obtains, e.g., the classes of automatic 
[Hod83, KN95, BG00] and tree-automatic [DT90, BG00] structures, for which the FO-theory
is decidable, and the classes of pushdown-graphs [MS85] and prefix recognizable
graphs [Cau96] with decidable MSO-theory. 
In the case of automatic structures, the decidability results are based on the strong closure
properties of finite automata, which are used to define the relations. Other techniques,
e.g. in [KL02] for rewriting in trace monoids, are based on Gaifman's locality theorem. 

The decidability results for MSO logic on pushdown and prefix recognizable graphs are 
derived from the results of Büchi and Rabin establishing the equivalence of monadic second-order
logic with certain families of finite automata accepting infinite trees (cf. [Tho97]).
These results are also underlying the more recent work [CT02] and [KNUW05] showing the
decidability of MSO logic over certain classes of infinite words and infinite trees, respectively. 

A different and more systematic approach for defining and studying classes of infinite 
structures is to use operations for transforming structures. An important operation of this 
kind is the model-theoretic interpretation. Such an interpretation defines a new structure 
'inside' a given one by means of logical formulas describing the domain and the new relations. 
Depending on whether these defining formulas are FO or MSO one speaks of FO- and 
MSO-interpretations. An important property of these interpretations is that decidability 
results easily transfer from the given structure to the resulting structure, i.e., applying an 
FO-interpretation to a structure with a decidable FO-theory results in a structure with 
decidable FO-theory, and similarly for MSO. 

As mentioned in [BG04], this suggests a new way of defining interesting classes of
infinite structures: fix an underlying structure (with good algorithmic properties) and con- 
sider all structures that can be obtained by applying interpretations of a certain kind. In 
this way, one obtains the automatic structures by FO-interpretations from, e.g., a suitable 
extension of Presburger arithmetic [Blu99], and the prefix recognizable structures by MSO-interpretations
from the infinite binary tree [Blu01]. This idea has been pursued further in
[Cau02], where MSO-interpretations and unravelling of graphs are iterated, leading to an
infinite hierarchy of graphs (or structures) with a decidable MSO-theory. 

All the methods and results described so far can be separated into those concerned 
with FO logic (sometimes extended by a reachability relation [DT90, Col02]) and those 
dealing with MSO logic (sometimes with only a restricted kind of set quantification as in
[Mad03]). To our knowledge, there has been no systematic work on relating these two 
areas. In this paper, we bridge this gap by studying a new kind of interpretation, named 
finite sets interpretation, which allows one to define classes of structures with decidable FO-theory
from structures with decidable MSO-theory. To be more precise, we are considering weak 
MSO (WMSO) logic, i.e., MSO logic where quantification is restricted to finite sets. The 
idea for these interpretations is rather simple: the domain of the new structure does not 
consist of elements of the old structure but of finite sets of elements of the old structure. 
The relations are specified by WMSO-formulas with free set variables (the number of which 
corresponds to the arity of the relation). In this way, FO-formulas over the new structure 
can directly be translated into WMSO-formulas over the old structure. The use of WMSO 
ensures that the universe of the resulting structure is countable. It is not clear whether 
using standard MSO and then restricting to those resulting structures with a countable 
universe gives the same class of structures. 

Using the equivalence of WMSO logic and finite automata (over finite words and trees) 
it is not difficult to see that the classes of automatic and tree automatic structures can be 
obtained by finite sets interpretations from (N, succ), i.e., the natural numbers with successor
relation, and from the infinite binary tree, respectively (see [Col04]). The connection
between automatic structures and finite sets interpretations applied to (N, succ) has already 
appeared before in the literature. In [ER66], the authors show that the infinite binary tree
together with the equal level relation can be generated from (N, succ) by a finite sets interpretation.
Today this extension of the infinite binary tree is known as a generator of
automatic structures in the sense that every automatic structure can be obtained from it
by a first-order interpretation. A similar result is given in [Rub04] but for another generator
of the automatic structures. 

This raises the question of what happens when we apply finite sets interpretations 
to other structures with decidable WMSO-theory, e.g., the structures from the hierarchy 
defined in [Cau02]. Though this hierarchy is strict, it is not a priori clear whether this is
also true for the hierarchy obtained after applying finite sets interpretations. To answer 
questions of this kind one has to study the expressiveness of finite sets interpretations and 
to provide tools for showing that a structure cannot be obtained by such an interpretation 
applied to a given structure. In particular, such tools can then be used to answer questions 
on automatic structures because these can be obtained by finite sets interpretations (as 
mentioned above). More precisely, we concentrate on understanding which
structures are finite sets interpretable in trees. All the examples mentioned so far
fall in this category.

We contribute to this study via two results. The first one, Theorem 4.1, establishes
that the quotient of a structure finite sets interpretable in a deterministic tree is itself finite
sets interpretable in that tree. This result was known for automatic structures, and was
open for tree-automatic structures. Theorem 4.1 is a generalisation of those two cases.

The second and main result, Theorem 4.3, allows us to reduce questions on definability by
finite sets interpretations to questions on WMSO-interpretability. A precise formulation of 
it (in its simplest form, see Corollary [43]) reads as follows: If the class of structures definable 
by finite sets interpretations from a structure S is included in the class of structures definable 
by finite sets interpretations from a deterministic tree t, then S is WMSO-interpretable in t. 
Those questions of WMSO-interpretability in trees are well understood since they can be 
reformulated in terms of clique-width. The clique-width is a measure of the complexity 
of graphs which was first introduced for finite graphs [Cou89], and then extended to
infinite ones [Cou04]. In this latter case, the equivalence is expressible as follows: "a graph
is WMSO-interpretable in a tree iff it is of bounded clique-width". Our result implies that 
if we can show that S is not WMSO-interpretable in t, then there are structures that can be 
obtained by a finite sets interpretation from S but not from t. A more technical formulation 
of the main result also explicitly gives such a structure. 

We demonstrate the use of Theorems 4.1 and 4.3 by showing some non-definability
results, the strictness of the hierarchy mentioned above, and a result on intrinsic definability 
of relations related to similar questions studied for automatic structures (cf. [Bar06]). 

The remainder of the paper is structured as follows. In Section 2 we give the basic
definitions and introduce finite sets interpretations, and in Section 3 we give the connection
to automatic structures. Section 4 is devoted to the study of finite sets interpretations
applied to trees. In particular, our two results, Theorems 4.1 and 4.3, are stated in this
section. In Section 5 we present some applications of our results. Finally, Section 6 is
devoted to the proof of the main result.

2. Definitions and elementary results

In this section we provide the basic definitions used in the paper, i.e., relational struc- 
tures, trees, logic, automata, and finally interpretations, the main subject of this work. We 
end this section by giving some elementary results on finite sets interpretations. 

2.1. Structures and trees. We consider (relational) structures $\mathcal{S} = (U, R_1, \ldots, R_N)$ where
$U$ is the universe of the structure and for each $i$, $R_i \subseteq U^{r_i}$ is a relation of arity $r_i$ for a
natural number $r_i$. The names of the $R_i$ together with their arities form the signature of
the structure. Trees, as defined below, can be seen in a natural way as particular instances 
of such structures. 

We will be dealing with infinite binary labeled trees. From now on, we simply write 'trees'.
Formally, a tree labeled by a finite alphabet $\Sigma$ is a partial mapping $t : \{0,1\}^* \to \Sigma$ with
prefix closed domain $dom(t)$, and such that if $u1 \in dom(t)$ then also $u0 \in dom(t)$. The
elements of the domain are called nodes. A node $u$ such that $u0$ is also a node is called an
inner node; otherwise it is called a leaf. By $\sqsubseteq$ we denote the prefix ordering on nodes, also called
the ancestor order. For technical simplicity we will mostly consider purely binary trees,
i.e., trees in which every node is either a leaf or has two sons.
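
The two closure conditions on the domain are easy to test on a finite domain. The following Python sketch is an illustration only: it assumes that nodes are represented as strings over '0' and '1' (the root being the empty string), and the helper names is_tree_domain, is_inner and is_purely_binary are hypothetical.

```python
def is_tree_domain(dom):
    """Check that dom is prefix closed and that u1 in dom implies u0 in dom."""
    for u in dom:
        # prefix closed: the parent of every non-root node is present
        if u and u[:-1] not in dom:
            return False
        # a right son u1 requires the left son u0
        if u.endswith("1") and u[:-1] + "0" not in dom:
            return False
    return True


def is_inner(dom, u):
    """A node u is an inner node if u0 is also a node; otherwise it is a leaf."""
    return u + "0" in dom


def is_purely_binary(dom):
    """Every node is either a leaf or has two sons."""
    return all((u + "0" in dom) == (u + "1" in dom) for u in dom)


example = {"", "0", "1", "00", "01"}
print(is_tree_domain(example), is_purely_binary(example))
```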

Seen as a structure, a tree labeled by $\Sigma$ has as universe the domain of the tree and contains
the following relations: the binary relations $S_0$ and $S_1$ meaning 'being a left successor
(resp. a right successor)' (for $i \in \{0,1\}$, $S_i(u,v)$ holds if $v = ui$) and for each $a \in \Sigma$ a unary
relation $a$ interpreted as the set of elements sent to $a$ by $t$.

We will be considering two particular infinite trees, namely $A_1$ and $A_2$. The tree $A_1$
is the unlabeled tree of domain $0^*$. We will identify in a natural way this tree with the
structure (N, succ). The tree $A_2$ is the unlabeled tree of domain $\{0,1\}^*$, also called the
infinite binary tree.

2.2. Logic and automata. We use the standard definitions for first-order (FO) and weak 
monadic second-order (WMSO) logic. FO-formulas are built up from atomic formulas using 
first-order variables (interpreted by elements of the structure and usually denoted by letters 
x, y, z) and the relation symbols from the signature under consideration. Complex formulas 
are constructed using boolean connectives and quantification over first-order variables. 

For WMSO-formulas one can additionally use monadic second-order variables (interpreted
by finite sets of elements of the structure and usually denoted by capital letters
$X, Y, Z$), quantification over such variables, and the membership relation $x \in X$. If the
variables are interpreted by arbitrary sets instead of finite sets, then we speak of MSO.

In order to deal with WMSO-formulas on trees, we use automata. Those automata are 
more general than WMSO-formulas since they have the expressiveness of full MSO logic 
on trees. But for our purpose this does no harm because we only use the translation in
one direction, namely from formulas to automata, and on deterministic trees one can define 
finiteness of a set in MSO (a set is finite if its prefix closure does not contain an infinite 
path), meaning that for each WMSO-formula there is an equivalent MSO-formula. 

Technically, we use nondeterministic parity automata (or simply automata), which are
tuples $(\Sigma, Q, q_{in}, \delta, \Omega)$ with a finite set $Q$ of states, initial state $q_{in}$, transition relation
$\delta \subseteq (Q \times Q \times \Sigma \times Q) \uplus (\Sigma \times Q)$, and priority mapping $\Omega : Q \to \mathbb{N}$. Recall that we only consider binary trees.
Given a tree $t$ and an automaton, a run of this automaton on $t$ is a mapping $\rho : dom(t) \to Q$
such that $(\rho(u0), \rho(u1), t(u), \rho(u)) \in \delta$ for each inner node $u$, and $(t(v), \rho(v)) \in \delta$ for every
leaf $v$. A run is accepting if $\rho(\varepsilon) = q_{in}$ and for all infinite branches (maximal totally ordered
sequences of nodes) $v_1, v_2, \ldots$, $\liminf_i \Omega(\rho(v_i))$ is even. We say that a tree $t$ is accepted by
an automaton if there is an accepting run of this automaton on $t$. For basic properties of
such automata (such as closure under the Boolean operations) and their relation to logic,
we refer the reader to [Tho97].

We are interested here in automata running on a fixed underlying tree t with additional 
markings representing (tuples of) subsets of its domain. To mark a certain subset X of 
$dom(t)$ we can put additional labels on the tree. Formally, the tree $t$ annotated by $X$ is the
tree with the same domain as $t$ and labels from $\Sigma \times \{0,1\}$, where a node $u$ is labeled by the
pair $(t(u), 0)$ if $u \notin X$ and $(t(u), 1)$ if $u \in X$. In the same way one can also annotate a tree
with tuples of subsets of its domain using a separate $\{0,1\}$-component for each set.
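
The annotation of a tree by a tuple of sets can be sketched as follows. This is only an illustration on finite trees: a tree is assumed to be a Python dict from nodes (strings over '0' and '1') to labels, and the function name annotate is hypothetical.

```python
def annotate(t, *subsets):
    """Return t annotated by the given subsets of dom(t).

    Each node u receives the label (t(u), b_1, ..., b_n), where b_i is 1
    if u belongs to the i-th set and 0 otherwise.
    """
    for X in subsets:
        if not set(X) <= set(t):
            raise ValueError("each marked set must be a subset of dom(t)")
    return {u: (a,) + tuple(int(u in X) for X in subsets) for u, a in t.items()}


t = {"": "a", "0": "b", "1": "a"}
print(annotate(t, {"0"}))
```

With a single set, the labels are exactly the pairs described above; with several sets, one component is added per set.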

If $t$ is fixed and $X_1, \ldots, X_n$ are subsets of $dom(t)$, then we say that an automaton
accepts the tuple $(X_1, \ldots, X_n)$ if it accepts $t$ with the additional labelings corresponding
to the tuple $(X_1, \ldots, X_n)$ as explained above. If we consider all the tuples accepted by
an automaton, we obtain a relation over the subsets of dom(t). We call this relation the 
relation recognized or accepted by the automaton. 

A WMSO-formula with free set variables $X_1, \ldots, X_n$ also defines a relation over the
subsets of dom(t). Throughout the paper we make use of the following result stating that 
for each WMSO-formula there is an equivalent automaton. The proof of this can easily be 
inferred from the equivalence of MSO and automata over trees and from the fact that over 
trees each WMSO-formula can be translated into an equivalent MSO-formula. 

Theorem 2.1 (cf. [Tho97]). For each WMSO-formula there is an automaton such that for
each tree t the relation over subsets of dom(t) defined by the formula is the same as the one 
accepted by the automaton. 

2.3. Interpretations. Interpretations are a standard tool in logic allowing to define trans- 
formations of structures by means of logical formulas. This technique allows easy transfer 
of theories from one structure onto another. 

Definition 2.2. An interpretation is a tuple $(\delta, \Phi_{R_1}, \ldots, \Phi_{R_n})$ of formulas. The formula $\delta$
has only one free variable, and each formula $\Phi_{R_i}$ has $r_i$ free variables. By our convention,
for weak monadic variables we use capital letters $X$ and $X_1, \ldots, X_{r_i}$, and small letters $x$
and $x_1, \ldots, x_{r_i}$ in the case of first-order variables.

An interpretation is FO if the formulas are first-order (and hence the free variables 
are also of first-order). An interpretation is WMSO if the formulas are weak monadic and 
the free variables are first-order. An interpretation is finite sets if the formulas are weak 
monadic and the free variables are weak monadic. 

The application of an interpretation to a structure is defined in the standard way. The 
only difference is that for finite sets interpretations the elements of the obtained structure 
are finite subsets of the universe of the original structure instead of elements of the original 
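structure. Formally, given a structure $\mathcal{S}$ and an interpretation $\mathcal{I} = (\delta, \Phi_{R_1}, \ldots, \Phi_{R_n})$, the
structure $\mathcal{I}(\mathcal{S})$ has for universe

• $\{u \in U_{\mathcal{S}} : \mathcal{S} \models \delta(u)\}$ if $\mathcal{I}$ is an FO or WMSO interpretation,

• $\{U \subseteq U_{\mathcal{S}} : U \text{ finite}, \mathcal{S} \models \delta(U)\}$ if $\mathcal{I}$ is a finite sets interpretation,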






T. COLCOMBET AND C. LODING 




Figure 1: The nodes of the infinite binary tree $A_2$ coded by sets of natural numbers. The levels shown are {2}, {1,2}, {0,2}, {0,1,2} (depth 2) and {3}, {2,3}, {1,3}, {1,2,3}, {0,3}, {0,2,3}, {0,1,3}, {0,1,2,3} (depth 3).



and the interpretation of each symbol $R_i$ is defined by

$R_i^{\mathcal{I}(\mathcal{S})} = \{(U_1, \ldots, U_{r_i}) \in (U_{\mathcal{I}(\mathcal{S})})^{r_i} : \mathcal{S} \models \Phi_{R_i}(U_1, \ldots, U_{r_i})\}$.
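
On a finite structure the application of a finite sets interpretation can be carried out by brute force, since every subset of the universe is finite. The following Python sketch is an illustration under simplifying assumptions: the formulas of the interpretation are replaced by ordinary Python predicates on frozensets, and the names finite_subsets and apply_finite_sets_interpretation are hypothetical.

```python
from itertools import combinations, product


def finite_subsets(universe):
    """Enumerate all subsets of a finite universe as frozensets."""
    elems = sorted(universe)
    for k in range(len(elems) + 1):
        for c in combinations(elems, k):
            yield frozenset(c)


def apply_finite_sets_interpretation(universe, delta, relations):
    """Build the universe and relations of the interpreted structure.

    delta is a predicate on one frozenset; relations maps a symbol name to
    a pair (arity, predicate taking that many frozensets).
    """
    new_universe = [X for X in finite_subsets(universe) if delta(X)]
    new_relations = {
        name: {tup for tup in product(new_universe, repeat=arity) if phi(*tup)}
        for name, (arity, phi) in relations.items()
    }
    return set(new_universe), new_relations


# Non-empty subsets of {0, 1, 2} with the proper-subset relation.
U, rels = apply_finite_sets_interpretation(
    {0, 1, 2},
    lambda X: len(X) > 0,
    {"sub": (2, lambda X, Y: X < Y)},
)
print(len(U), len(rels["sub"]))
```

The enumeration is exponential in the size of the universe, so this is only meant to make the definition concrete on toy examples.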

One can note at this point that sets interpretations can naturally be defined as well in a
similar way, simply by removing the finiteness hypothesis on sets and using MSO instead 
of WMSO. At the end of this section we briefly comment on such possible variants for the 
definition. 

The following example already appears in [ER66]. It shows how the complete binary
tree extended with the equal level relation can be defined by a finite sets interpretation from
the natural numbers with successor relation. This extension of the binary tree is well-known
as a generator of the automatic structures, in the sense that each automatic structure can
be obtained from it by an FO-interpretation (cf. [BG00]).

Example 2.3. We show how to obtain the structure $(\{0,1\}^*, S_0, S_1, \sqsubseteq, el)$, i.e., the infinite
binary tree extended with the prefix and equal level relations, by a finite sets interpretation
from the infinite unary tree $A_1$, i.e., from the natural numbers with successor
$A_1$ = (N, succ). To realize this we have to code the nodes of the tree by finite sets of natural
numbers and to describe the relations $S_0$ (for left successor), $S_1$ (for right successor), $\sqsubseteq$
(for prefix), and $el$ (for equal level) by means of WMSO formulas.

The coding of the nodes is depicted in Figure 1. A node $u \in \{0,1\}^*$ is represented by the
set of positions corresponding to letter 1 in $u$ and additionally by its length. For example,
the node 100 is coded by {0, 3} because its length is 3 and only position 0 is labeled 1. We
now define the finite sets interpretation $\mathcal{I} = (\delta, \Phi_{S_0}, \Phi_{S_1}, \Phi_{\sqsubseteq}, \Phi_{el})$ such that $\mathcal{I}(A_1)$ yields
the binary tree depicted in Figure 1 together with the relations $\sqsubseteq$ and $el$. In the formulas
we use abbreviations like $<$ and $\max$ that can easily be defined by WMSO-formulas in $A_1$.

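• $\delta(X) := \exists x\,(x \in X)$ (all finite sets except $\emptyset$ are used in the coding).

• $\Phi_{\sqsubseteq}(X_1, X_2) := \max(X_1) < \max(X_2) \wedge \forall x\,(x < \max(X_1) \rightarrow (x \in X_1 \leftrightarrow x \in X_2))$.

• $\Phi_{S_0}(X_1, X_2) := \Phi_{\sqsubseteq}(X_1, X_2) \wedge \max(X_2) = \max(X_1) + 1 \wedge \max(X_1) \notin X_2$.

• $\Phi_{S_1}(X_1, X_2) := \Phi_{\sqsubseteq}(X_1, X_2) \wedge \max(X_2) = \max(X_1) + 1 \wedge \max(X_1) \in X_2$.

• $\Phi_{el}(X_1, X_2) := \max(X_1) = \max(X_2)$.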
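
The coding of Example 2.3 can be checked mechanically on small inputs. The following Python sketch is only an illustration: the function names (encode, decode, phi_prefix and so on) are hypothetical, nodes are written as strings over '0' and '1', and max and < are evaluated directly on integers rather than defined in WMSO. The first conjunct of the prefix formula is taken with a strict comparison, as printed.

```python
def encode(u):
    """Code a node u as the set of positions carrying a 1, together with |u|."""
    return frozenset(i for i, c in enumerate(u) if c == "1") | {len(u)}


def decode(X):
    """Inverse of encode: max(X) is the length, the other elements mark the 1s."""
    n = max(X)
    return "".join("1" if i in X else "0" for i in range(n))


def delta(X):
    """Domain formula: every non-empty finite set codes a node."""
    return len(X) > 0


def phi_prefix(X1, X2):
    m1 = max(X1)
    # below max(X1), both sets must agree on every position
    return m1 < max(X2) and all((x in X1) == (x in X2) for x in range(m1))


def phi_s0(X1, X2):
    return phi_prefix(X1, X2) and max(X2) == max(X1) + 1 and max(X1) not in X2


def phi_s1(X1, X2):
    return phi_prefix(X1, X2) and max(X2) == max(X1) + 1 and max(X1) in X2


def phi_el(X1, X2):
    return max(X1) == max(X2)


print(sorted(encode("100")))
```

For instance, encode("100") yields the set {0, 3}, matching the example given in the text.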

Let us proceed with some elementary considerations. Obviously, finite sets interpretations
are not closed under composition. But, as stated in the following proposition, applying
an FO-interpretation after, or a WMSO-interpretation before a finite sets interpretation,
does not give more expressive power.

Proposition 2.4. Let $\mathcal{I}_1$ be an FO-interpretation, $\mathcal{I}_2$ be a WMSO-interpretation, and $\mathcal{I}$ be
a finite sets interpretation. Then $\mathcal{I}_1 \circ \mathcal{I} \circ \mathcal{I}_2$ is effectively a finite sets interpretation.

Proof. As for standard interpretations. □ 

A straightforward as well as essential consequence of this is expressed in the following 
corollary. 

Corollary 2.5. The image of a structure of decidable WMSO-theory by a finite sets inter- 
pretation has a decidable FO-theory. 
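
To illustrate the translation behind Corollary 2.5, consider a sentence over a signature with a single binary symbol $R$ (a hypothetical example, not taken from the text). First-order quantifiers become quantifiers over finite sets relativized to $\delta$, and atoms are replaced by their defining formulas:

```latex
\begin{align*}
\psi \;&=\; \forall x\, \exists y\; R(x,y) \\
\psi^{\mathcal{I}} \;&=\; \forall X\, \bigl(\delta(X) \rightarrow \exists Y\, (\delta(Y) \wedge \Phi_R(X,Y))\bigr)
\end{align*}
```

Here $X$ and $Y$ range over finite sets, and $\mathcal{I}(\mathcal{S}) \models \psi$ holds if and only if $\mathcal{S} \models \psi^{\mathcal{I}}$, so a decision procedure for the WMSO-theory of $\mathcal{S}$ yields one for the FO-theory of $\mathcal{I}(\mathcal{S})$.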

To finish our elementary considerations on finite sets interpretations, we present Proposition 2.7,
which is a form of converse to Proposition 2.4. It states that every finite sets interpretation
can be described as the composition of a specific one, called the weak powerset
interpretation, and a first-order interpretation.

Definition 2.6. Let $\mathcal{P}^W$ be the finite sets interpretation that sends every structure $\mathcal{S}$ of
signature $\Sigma$ onto a structure of signature $\Sigma \cup \{\subseteq\}$ where $\subseteq$ is a new binary symbol, such
that

• the universe is the set of finite subsets of the universe of $\mathcal{S}$,

• each symbol $R$ in $\Sigma$ has the same interpretation as in $\mathcal{S}$ but over singletons instead
of elements,

• the interpretation of $\subseteq$ corresponds to the subset ordering.

The interpretation $\mathcal{P}^W$ is called the weak powerset interpretation.
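
For a finite structure, the weak powerset structure of Definition 2.6 can be computed explicitly. The following Python sketch is an illustration only: the function name weak_powerset and the key "subset" used for the new binary symbol are hypothetical, and the input signature is assumed not to contain a symbol of that name.

```python
from itertools import combinations


def weak_powerset(universe, relations):
    """relations maps each symbol to a set of tuples over the universe."""
    elems = sorted(universe)
    subsets = [frozenset(c) for k in range(len(elems) + 1)
               for c in combinations(elems, k)]
    # each original relation is carried over to singletons
    lifted = {
        name: {tuple(frozenset({x}) for x in tup) for tup in rel}
        for name, rel in relations.items()
    }
    # the new binary symbol is interpreted as the subset ordering
    lifted["subset"] = {(X, Y) for X in subsets for Y in subsets if X <= Y}
    return set(subsets), lifted


U, rels = weak_powerset({0, 1, 2}, {"succ": {(0, 1), (1, 2)}})
print(len(U), len(rels["subset"]))
```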

This interpretation allows us to reconstruct all other finite sets interpretations as stated
in the following proposition which is obtained by a simple syntactic translation of formulas. 

Proposition 2.7. For each finite sets interpretation $\mathcal{I}$ there exists an FO-interpretation $\mathcal{I}_1$
such that $\mathcal{I}_1 \circ \mathcal{P}^W = \mathcal{I}$.

Possible variants. The definition of finite sets interpretations that we provide uses WMSO 
and one might wonder why we restrict ourselves to this logic. At least two modifications of 
the definition are very natural: 

• replace WMSO logic with MSO logic in the formulas (but still only consider finite 
sets for the free variables), 

• use full MSO and produce a structure over the powerset of the original structure 
instead of its finite subsets. 

If we use the first extension, then Proposition 2.7 fails. All other results in this paper would
remain unchanged. 

The second extension leads to what can be naturally called sets interpretation. The 
formulas are MSO, as well as the free variables. When applying such a sets interpretation 
to a structure, one produces a structure of universe included in the powerset of the universe 
of the original structure. 

All the results presented above remain valid for this variant. However, all the results
developed below in this work make explicit use of the finiteness of the sets representing
elements of the new structure. In particular, one can conjecture that Theorem 4.1 would
fail for sets interpretations, whereas the conjecture would be that Theorem 4.3 still holds for
sets interpretations.

This new kind of interpretation makes it possible to define structures of uncountable universe
from structures of countable universe. This is a major difference with respect to finite
sets interpretations. In general, the relation between sets interpretations and finite sets
interpretations is not yet understood. In particular, we do not know whether the countable
structures obtainable by sets interpretations from trees coincide with the structures
obtainable by finite sets interpretations from trees. 

3. Automatic-like structures 

The line of research that has inspired finite sets interpretations is the one of automatic 
structures. Automatic structures in the usual sense are structures with a universe
consisting of a regular language of words, and the relations defined by half-synchronized 
transducers. The key reason for introducing such structures is that — thanks to the good 
closure properties of finite automata — they naturally possess decidable first-order theories. 

Classical variants of those structures consider universes consisting of infinite words ($\omega$-automatic
structures), or consisting of trees, finite or infinite (namely the tree-automatic
and $\omega$-tree-automatic structures). Some definitions also allow quotienting the structure by a
congruence (that is defined in the same way as the other relations). This extension does not
increase the expressiveness of automatic or tree-automatic structures (cf. Corollary 4.2
below). The question whether quotienting increases the expressive power of $\omega$-automatic
and $\omega$-tree-automatic structures is open.

Historically, automatic as well as $\omega$-automatic structures have been introduced by Hodgson
[Hod83]. Khoussainov and Nerode introduce the notion of automatically presentable
theories [KN95], starting the study of definability in automatic structures. The extension
to tree-automatic structures can be traced back, in a different framework, to the work of
Dauchet and Tison [DT90]. Blumensath and Grädel [Blu99, BG00] then formalize the
notion of tree-automatic structure and add to it the family of $\omega$-tree-automatic structures.
Independently, the study of 3-manifolds led to the particular case of automatic
groups [ECH+92].

3.1. Word-automatic structures. In the case of words a relation $R \subseteq (\Sigma^*)^r$ is automatic
if there is a finite automaton accepting exactly the tuples $(w_1, \ldots, w_r) \in R$, where the
automaton reads all the words in parallel with the shorter words padded with a dummy
symbol $\diamond$. Formally, for $w_1, \ldots, w_r \in \Sigma^*$ we define

$$w_1 \otimes \cdots \otimes w_r = \begin{bmatrix} a_{11} \\ \vdots \\ a_{r1} \end{bmatrix} \cdots \begin{bmatrix} a_{1n} \\ \vdots \\ a_{rn} \end{bmatrix} \in (\Sigma_\diamond^r)^*$$

where $\Sigma_\diamond = \Sigma \cup \{\diamond\}$, $n$ is the maximal length of one of the words $w_i$, and $a_{ij}$ is the $j$th letter
of $w_i$ if $j \le |w_i|$ and $\diamond$ otherwise. A language $L \subseteq (\Sigma_\diamond^r)^*$ defines a relation $R_L \subseteq (\Sigma^*)^r$
in the obvious way: $(w_1, \ldots, w_r) \in R_L$ iff $w_1 \otimes \cdots \otimes w_r \in L$. A tuple $(L, L_1, \ldots, L_n)$ of
languages $L \subseteq \Sigma^*$ and $L_i \subseteq (\Sigma_\diamond^{r_i})^*$ defines a structure of universe $L$ with the relations $R_{L_i}$
of arity $r_i$. A structure $\mathcal{S} = (U, R_1, \ldots, R_n)$ is automatic if it is isomorphic to a structure
of the above kind for regular languages $L, L_1, \ldots, L_n$.

The class of $\omega$-automatic structures is defined in the same way with infinite words
instead of finite ones. In this case the definition is even simpler as there is no need for
padding shorter words.
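
The padded product of finite words used in the definition of word-automatic structures can be sketched in a few lines of Python. This is an illustration only: the character '#' stands in for the dummy padding symbol and is assumed not to occur in the words, and the names convolution and is_prefix_via_columns are hypothetical. The second function scans the columns from left to right with a single non-rejecting state, which is one way to see that the prefix relation on words is automatic.

```python
PAD = "#"  # stand-in for the dummy symbol; assumed not to occur in the words


def convolution(*words):
    """Return the list of columns of the padded product of the given words."""
    n = max((len(w) for w in words), default=0)
    return [tuple(w[j] if j < len(w) else PAD for w in words) for j in range(n)]


def is_prefix_via_columns(w1, w2):
    """Decide 'w1 is a prefix of w2' by reading the columns one by one."""
    for a, b in convolution(w1, w2):
        # once w1 is exhausted (a is the padding symbol) any column is fine;
        # before that, both components must carry the same letter
        if a != PAD and a != b:
            return False
    return True


print(convolution("ab", "abba"))
```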

3.2. Tree-automatic structures. To define tree-automatic structures we need a way to
code tuples of finite trees, i.e., we need an operation $\otimes$ for finite trees. For a tree
$t : dom(t) \to \Sigma$ let $t^\diamond : \{0,1\}^* \to \Sigma_\diamond$ be defined by $t^\diamond(u) = t(u)$ if $u \in dom(t)$, and $t^\diamond(u) = \diamond$
otherwise. For finite $\Sigma$-labeled trees $t_1, \ldots, t_r$ we define the $\Sigma_\diamond^r$-labeled tree $t = t_1 \otimes \cdots \otimes t_r$
by $dom(t) = dom(t_1) \cup \cdots \cup dom(t_r)$ and $t(u) = (t_1^\diamond(u), \ldots, t_r^\diamond(u))$. When viewing words as
unary trees, this definition corresponds to the operation $\otimes$ as defined for words. As in the
case of words a set $T$ of finite $\Sigma_\diamond^r$-labeled trees defines the relation $R_T$ by $(t_1, \ldots, t_r) \in R_T$
iff $t_1 \otimes \cdots \otimes t_r \in T$. A structure is called tree-automatic if it is isomorphic to a structure
given by a tuple $(T, T_1, \ldots, T_n)$ of regular tree languages in the same way as for words.

Again, the definitions for $\omega$-tree-automatic structures are the same with $\omega$-trees, i.e.,
trees of domain $\{0,1\}^*$ instead of finite trees.
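
The product of finite trees can be sketched in the same spirit as for words. The following Python fragment is an illustration only: a tree is assumed to be a dict from nodes (strings over '0' and '1') to labels, '#' stands in for the dummy symbol, and the name tree_convolution is hypothetical.

```python
PAD = "#"  # stand-in for the dummy symbol; assumed not to be a label


def tree_convolution(*trees):
    """Overlay the trees: the domain is the union of the domains, and a node
    missing from one of the trees receives the padding symbol there."""
    dom = set().union(*trees)  # iterating a dict yields its keys (the nodes)
    return {u: tuple(t.get(u, PAD) for t in trees) for u in dom}


t1 = {"": "a", "0": "b", "1": "b"}
t2 = {"": "c"}
print(tree_convolution(t1, t2))
```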

One should note here that we only consider so-called injective presentations of automatic
structures. A more general definition as, e.g., in [Blu99] additionally uses a regular language
contained in $(\Sigma_\diamond^2)^*$ defining an equivalence relation identifying words representing the same element
of the structure (and similarly for the other variants of automatic structures). It is known
that injective presentations are sufficient for automatic structures [KN95], meaning that all
structures that are automatic in the more general sense are also automatic according to our
definition. The corresponding result for tree-automatic structures is established below, see
Corollary 4.2.

3.3. Automaticity via interpretations. Recall that $A_1$ is the (unlabeled) infinite unary
tree, i.e., the natural numbers with successor, and that $A_2$ is the (unlabeled) infinite binary
tree. The following fact is a straightforward consequence of the definition of automatic
structures and of the equivalences between WMSO-logic and automata. The first claim
also appears in [Rub04, Thm. C.2.11, page 50].

Proposition 3.1. The following holds up to isomorphism:

• A structure is automatic iff it is finite sets interpretable in $A_1$.

• A structure is tree-automatic iff it is finite sets interpretable in $A_2$.

Proof (sketch). As already mentioned, the first claim is shown in [Rub04].

We describe here how to proceed for tree-automatic structures, starting with an explanation
of how to obtain a tree-automatic presentation from a finite sets interpretation $\mathcal{I}$ in $A_2$.
A finite subset $X$ of $A_2$ is coded as a finite tree as follows: the tree has the smallest domain
that contains all elements from $X$ and is labeled 0 at positions that are not in $X$ and 1 at
positions that are in $X$. For each formula of $\mathcal{I}$ there is an equivalent parity automaton over
$A_2$. This automaton can easily be turned into an automaton over finite trees accepting the
corresponding relation over the codings as just described.
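
The coding of a finite set of nodes as a finite labeled tree can be sketched as follows. This Python fragment is an illustration only: nodes are strings over '0' and '1', the name code_as_tree is hypothetical, and the smallest domain is understood as the smallest set containing $X$ that satisfies the two closure conditions on tree domains from Section 2.1.

```python
def code_as_tree(X):
    """Code a finite set X of nodes as a finite tree with labels 0 and 1."""
    dom = set()
    for u in X:
        for k in range(len(u) + 1):
            dom.add(u[:k])  # close under prefixes
    for u in list(dom):
        if u.endswith("1"):
            dom.add(u[:-1] + "0")  # a right son requires its left sibling
    # label 1 exactly at the positions that belong to X
    return {u: int(u in X) for u in dom}


print(code_as_tree({"01"}))
```

Under this reading, the empty set is coded by the tree with empty domain.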

For the other direction we start from a tree-automatic presentation of a structure. 
A first thing to note is that a singleton alphabet of tree labels is sufficient because each 
symbol from a larger alphabet can be encoded in a finite pattern; in this construction, we 
use the leaf/non-leaf nature of each node for coding information. Since now the alphabet is 
a singleton, one naturally encodes a tree $t$ by the finite set $dom(t)$. Now pick an automaton
from the tree-automatic presentation accepting a relation and pick a tuple of such sets coding
finite trees. Using standard techniques for translating automata to logic (cf. [Tho97]),
one describes in WMSO that the corresponding tuple of finite trees is accepted by the
automaton. □

Note that the $\omega$-automatic and $\omega$-tree-automatic structures satisfy the same equivalences
where finite sets interpretations are replaced by sets interpretations.

4. Finite sets interpretations on trees 

The power of finite sets interpretations makes it difficult to obtain results for the general 
case where such interpretations are applied to arbitrary structures. For this reason, in this 
article we only consider the special case of finite sets interpretations applied to deterministic 
trees. 

This restriction can be justified in two ways. The first justification is that on trees there 
are specific tools suitable for treating WMSO questions: their automata equivalents. The 
second justification is that if Seese's conjecture [See91] holds — stating that all structures of 
decidable weak monadic second-order theory are WMSO-interpretations of trees — then the 
only structures that we can prove to have decidable first-order theory using Corollary 2.5
are finite sets interpretations of trees (for recent work on Seese's conjecture see [CO06]).

We give here two results concerning finite sets interpretations applied to deterministic
trees. The first one, in Subsection 4.1, shows that finite sets interpretations on deterministic
trees followed by a quotient are simply finite sets interpretations. The second
result, the subject of Subsection 4.2, concerns finite sets interpretations applied to trees
leading to powerset lattices. The technical core of the proof is given in Section 6.

4.1. Quotienting finite sets interpretations on trees. We show here that if a structure 
is finite sets interpretable in a deterministic tree containing a symbol interpreted as a 
congruence on the structure, then it is possible to directly obtain the quotiented structure 
by a finite sets interpretation. 

A congruence on a structure S is an equivalence relation ~ such that for every symbol
R of arity n and all elements $x_1, \dots, x_n, y_1, \dots, y_n$ of S: if $x_1 \sim y_1, \dots, x_n \sim y_n$, then
$R^S(x_1, \dots, x_n)$ iff $R^S(y_1, \dots, y_n)$. We say that a symbol is a congruence if its interpretation
in the structure is a congruence. For a congruence ~ over a structure S, we denote by $S/{\sim}$
the quotient structure, i.e., the structure which has as elements the equivalence classes of ~,
and the relations of which are the images of the relations on S under the canonical surjection
induced by ~.
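For a finite structure, the congruence condition and the quotient can be computed directly. The following is only a minimal sketch: the function names `is_congruence` and `quotient` and the toy structure (the integers 0 to 5 with one binary relation) are assumptions made for this illustration and do not come from the paper.

```python
from itertools import product


def is_congruence(domain, equiv, relations):
    """Check the congruence condition for `equiv`.

    `relations` is a list of (set_of_tuples, arity) pairs.
    `equiv` is assumed to already be an equivalence relation.
    """
    for tuples, arity in relations:
        for xs in product(domain, repeat=arity):
            for ys in product(domain, repeat=arity):
                if all(equiv(x, y) for x, y in zip(xs, ys)):
                    if (xs in tuples) != (ys in tuples):
                        return False
    return True


def quotient(domain, equiv, relations):
    """Return the equivalence classes and the images of the relations."""
    cls = {x: frozenset(y for y in domain if equiv(x, y)) for x in domain}
    classes = set(cls.values())
    images = [({tuple(cls[x] for x in xs) for xs in tuples}, arity)
              for tuples, arity in relations]
    return classes, images


# Toy structure: domain {0,...,5}, R(x, y) iff y - x = 1 modulo 3.
domain = list(range(6))
R = {(x, y) for x in domain for y in domain if (y - x) % 3 == 1}


def same_mod3(x, y):
    return x % 3 == y % 3


classes, images = quotient(domain, same_mod3, [(R, 2)])
```

In this toy case the quotient has three elements and the image of R is a directed cycle of length three.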

Note that the operation of quotienting preserves the decidability of the first-order the- 
ory. For this reason we may wonder if constructing a structure by a finite sets interpretation 
followed by a quotient is more powerful than solely using a finite sets interpretation.
Theorem 4.1 below shows that this is not the case when the original structure is a deterministic
tree.

Theorem 4.1. Given a finite sets interpretation $\mathcal{I}$, there exists a finite sets interpretation $\mathcal{I}'$
such that for every deterministic tree t, if the symbol ~ is a congruence in $\mathcal{I}(t)$, then $\mathcal{I}'(t)$
is isomorphic to $\mathcal{I}(t)/{\sim}$.

Proof. Let A be a nondeterministic parity automaton corresponding to the formula $\Phi_\sim$
describing ~ in $\mathcal{I}$. This automaton works on t additionally annotated by a pair (X, Y) of
sets of nodes. We say that the automaton reads (X, Y).

For a prefix closed subset S of dom(t), we call the set of minimal nodes not in S its 
border. Let X be an element of the structure I(t), i.e., a finite set of nodes of t. We 
construct its shadow S(X) by taking the prefix closure of X and then adding all trees of 
height at most $2^{|Q|} - 1$ that are rooted at the border of this prefix closure, where Q denotes
the set of states of A. So S(X) is
the least set of nodes containing X that is closed by prefix and such that every element of its
border is the root of a subtree of height at least $2^{|Q|}$. Obviously, S(X) is finite. Now, let
us consider an equivalence class c for ~. Define the shadow S(c) of the class c to be the
intersection of the shadows of all the elements in the class. This is also a finite set of nodes
of t. We define the border of the class, denoted by B(c), to be the border of its shadow.
Note that the subtrees rooted at nodes from B(c) have height at least $2^{|Q|}$; indeed, this
property is preserved when intersecting shadows.
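The notions of prefix closure and border can be made concrete with a short sketch. It assumes, for simplicity only, that t is the full binary tree with nodes written as strings over {0, 1}; the paper allows arbitrary deterministic trees, and the function names are hypothetical.

```python
def prefix_closure(nodes):
    """All prefixes of the given nodes (nodes are strings over '01')."""
    return {w[:i] for w in nodes for i in range(len(w) + 1)}


def border(closed):
    """Minimal nodes of the full binary tree that are not in `closed`.

    `closed` must be prefix closed; the empty set has the root as border.
    """
    if not closed:
        return {""}
    return {w + d for w in closed for d in "01" if w + d not in closed}
```

For instance, the prefix closure of {"0"} is {"", "0"}, and its border consists of the nodes "1", "00" and "01".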

Given an element X, its description is the triple (B, Y, f), where B = B(c) for the
class c of X, Y is $X \cap S(c)$, i.e., X restricted to the shadow of its class, and f maps each
node $x \in B$ to the set of states q such that there is an accepting run of A on the subtree
rooted in x starting with state q and reading $(\emptyset, X)$.

We claim that if two elements X and X' share the same description, say (B, Y, f),
then those elements are equivalent for ~. Using the transitivity of ~ and the finiteness of B 
it is sufficient to consider the case of elements coinciding everywhere but below one x in B. 
Note that X and X' coincide above B because they have the same description. Since x is 
in the border of the class of X, there is an element Z equivalent to X such that Z does 
not contain any node below x. Since Z is equivalent to X, there is an accepting run $\rho$
of A on t labeled by (Z, X). We aim at constructing a run of A witnessing the equivalence
of Z and X', i.e., a run accepting t labeled by (Z, X'). This new run is constructed in the
following way. On every node not below x, it coincides with $\rho$. This is a valid part of a run
since X and X' do coincide on this area. On the subtree rooted in x, Z coincides with $\emptyset$.
Hence, as (B, Y, f) is a description of X, $\rho(x)$ belongs to f(x). But as the same description
holds also for X', there is a piece of run below x starting with $\rho(x)$ and accepting $(\emptyset, X')$. We
complete our new run by this piece of run. This new run witnesses as expected that Z ~ X'.
It follows by symmetry and transitivity of ~ that X ~ X' . This concludes the proof of the 
claim. 

Let us remark now that a description (B, Y, f) can be encoded uniquely by a set of
nodes: this set is $B \cup Y \cup \mathrm{Coding}(f)$ where Coding(f) contains exactly one element for each
element x of the border, and this element is located in a place uniquely describing the value
of f(x) (e.g. the leftmost node at distance g(f(x)) below x, where g is a numbering of $2^Q$
starting from 1). This is possible since the subtree rooted in x has height at least $2^{|Q|}$,
and consequently, there is "room" below x for coding the information f(x).

Note that associating to an element X the coding of its description is doable by means of 
a WMSO-formula. Note also that given a class c there is only a finite number of descriptions 
for the elements it contains. Hence we can choose the smallest description — smallest for 
a suitable total order — as unique representative for the class. A suitable total order can, 
e.g., use the lexicographically smallest node that is in the symmetric difference of the two 
sets coding the two descriptions. From here, it is not difficult to reconstruct $\mathcal{I}(t)/{\sim}$. □

In combination with Proposition 3.1 we obtain the following.

Corollary 4.2. Tree-automatic structures are effectively closed under quotient.

In the terminology of [Blu99], this result is rephrased as "every tree-automatic structure
admits an injective presentation." Let us remark that this result is announced in [Blu99],
but unfortunately the proof proposed there contains an unrecoverable error.

4.2. Finite sets interpretations and powerset lattices. In this section we present our
main result. For this, let S be a structure of signature $\subseteq$. We say that S is a finite
powerset lattice if it is isomorphic to $(\mathcal{P}^F(E), \subseteq)$ for some set E, where $\mathcal{P}^F(E)$ represents
the finite subsets of E. Such a finite powerset lattice can be seen as a particular case
of the weak powerset generator applied to a vocabulary-free structure. We call atoms the elements
corresponding to singletons in this isomorphic structure, i.e., those elements which
have exactly one element strictly smaller with respect to $\subseteq$.
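The order-theoretic characterization of atoms can be checked on a small instance. The sketch below builds the lattice of subsets of a finite set E (so every subset is finite) and selects the elements with exactly one strictly smaller element; the helper names are hypothetical and the example is an illustration only.

```python
from itertools import combinations


def finite_subsets(elements):
    """All subsets of a finite collection, as frozensets."""
    elems = list(elements)
    return [frozenset(c) for k in range(len(elems) + 1)
            for c in combinations(elems, k)]


def atoms(lattice):
    """Elements having exactly one strictly smaller element (proper inclusion)."""
    return [x for x in lattice if sum(1 for y in lattice if y < x) == 1]


E = {"a", "b", "c"}
lattice = finite_subsets(E)
```

Here `atoms(lattice)` returns exactly the three singletons, since the only element strictly below a singleton is the empty set.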

Theorem 4.3. For every finite sets interpretation $\mathcal{I} = (\delta(X), \phi_\subseteq(X, Y))$, there exists a
WMSO-formula Code(X, x) such that, whenever for some tree t, $\mathcal{I}(t)$ is a finite powerset
lattice, then Code(X, x) evaluates on t to an injection mapping the atoms of $\mathcal{I}(t)$ to nodes
of t.

Section 6 is dedicated to the rather long and involved proof of this result.
We rarely use the theorem in this form. We rather use weakened versions of it, namely
Corollary 4.4 and Corollary 4.5.

Corollary 4.4. For every finite sets interpretation $\mathcal{I}$, there exists a WMSO-interpretation
$\mathcal{I}_2$ such that whenever for some structure S and some tree t, $\mathcal{I}(t)$ is isomorphic
to $\mathcal{P}^W(S)$ then $\mathcal{I}_2(t)$ is isomorphic to S.

Proof. If we remove all relations other than $\subseteq$, the weak powerset generator is nothing
but a finite powerset lattice. Hence we can obtain a formula Code(X, x) by application of
Theorem 4.3. It is then easy to transfer all relations defined on singletons to their image
by Code.

Formally, we define the WMSO-interpretation $\mathcal{I}_2 = (\delta, \dots, \Phi'_R, \dots)$ as follows:

• $\delta(x) = \exists X.\ \mathrm{Code}(X, x)$,

• for each symbol R of arity r from the signature, $\Phi'_R(x_1, \dots, x_r)$ is defined as

$$\exists X_1, \dots, X_r.\ \Phi_R(X_1, \dots, X_r) \wedge \bigwedge_{i=1}^{r} \mathrm{Code}(X_i, x_i)$$

where each $\Phi_R$ is the WMSO-formula in $\mathcal{I}$ defining the interpretation of the symbol R.

As Code maps each atom of $\mathcal{I}(t)$ to a unique node, this interpretation indeed maps t to a
structure isomorphic to S. □

A weaker, yet more readable formulation of the above corollary is provided in the 
following one. 

Corollary 4.5. If for a structure S and a tree t the class of structures that can be obtained 
by finite sets interpretations from S is contained in the class of structures that can be 
obtained by finite sets interpretations from t, then S is WMSO-interpretable in t. 

Proof. Assume that the hypothesis of the corollary holds. In particular, the structure $\mathcal{P}^W(S)$
is isomorphic to $\mathcal{I}(t)$ for some finite sets interpretation $\mathcal{I}$. By applying Corollary 4.4, the
structure S is WMSO-interpretable in t. □

5. Applications

We present here several applications of the results above, ordered by level of complexity.
The first two, showing that the free monoid is not obtainable by a finite sets
interpretation of a tree (Section 5.1) and that a natural hierarchy of structures is strict
(Section 5.2), are paradigmatic applications of Theorem 4.3. Section 5.3 establishes that
the random graph is not finite sets interpretable in a tree, extending the known result for
automatic structures [KNRS04]. Finally, in Section 5.4 we study intrinsically definable
relations in generators of the automatic structures.

Some of those results are known in the weaker context of automatic structures. We 
would like to underline here the fundamental methodological difference of our approach: in
none of the applications below do we perform a combinatorial analysis of finite sets interpretations.
Instead, we systematically reduce the problem to a much easier one of WMSO-definability
in trees.

5.1. The free monoid. We consider the free monoid as a structure ({a, b}* , •, a, b) — the 
set of words over {a, b} together with the ternary relation corresponding to the concatenation 
and the two words a and b identified by unary predicates — and want to answer the question 
whether this structure is isomorphic to a finite sets interpretation of a tree. 

One should note that the FO theory of the free monoid is undecidable and hence we 
can directly conclude that it cannot be obtained by a finite sets interpretation from a tree 
with a decidable WMSO theory. However, this reasoning does not include trees with an 
undecidable WMSO theory. 

The negative answer we give here to the above question is the simplest and in some 
sense the purest application of the results presented above and should be considered for this 
reason as a key example. 

The following result was obtained in a discussion with Olivier Ly. 

Proposition 5.1. The free monoid over a two letter alphabet is not isomorphic to any 
finite sets interpretation of a tree. 

Proof. We first show how to obtain $\mathcal{P}^W(N, +)$ from the free monoid by an FO-interpretation
followed by a quotient. Then, assuming that our claim is false, we invoke the two results
from the previous section and obtain a contradiction.

Let f be the function which to each word of the form $b a^{n_1} b a^{n_2} b \dots b a^{n_k} b$ over {a, b}
associates the set of naturals $\{n_1, n_2, \dots, n_k\}$. The domain of f is the set of elements
satisfying $\mathrm{dom}(x) = \exists y.\ x = byb$. The relation of inclusion is also first-order definable:
$f(u) \subseteq f(v)$ iff sub(u, v) holds with

$$\mathrm{sub}(u, v) = \forall x \in a^*.\ (\exists y, z.\ u = ybxbz) \rightarrow (\exists y', z'.\ v = y'bxbz'),$$

where $x \in a^*$ stands for $\forall y, z.\ x \neq ybz$.

Let x ~ y be the formula $\mathrm{sub}(x, y) \wedge \mathrm{sub}(y, x)$. Naturally, for every u, v, u ~ v iff f(u) = f(v).
Finally the addition over singletons is definable. More precisely, f(u) = {i}, f(v) = {j}
and f(w) = {i + j} iff add(u, v, w) holds with

$$\mathrm{add}(u, v, w) = \exists x \in a^*,\ y \in a^*.\ u \sim bxb \wedge v \sim byb \wedge w \sim bxyb.$$
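The encoding f and the formulas sub, ~ and add can be tested on concrete words. The following sketch evaluates them by brute force on Python strings; the function names mirror the formulas, while the bounded search in `add` is an implementation choice of this illustration, not part of the paper.

```python
def f(word):
    """Set of exponents n_i of a word b a^{n_1} b ... b a^{n_k} b."""
    assert len(word) >= 2 and word[0] == "b" and word[-1] == "b"
    assert set(word) <= {"a", "b"}
    return {len(block) for block in word.split("b")[1:-1]}


def sub(u, v):
    """Every factor b a^n b of u is also a factor of v (the formula sub)."""
    return all(("b" + "a" * n + "b") in v
               for n in range(len(u))
               if ("b" + "a" * n + "b") in u)


def equiv(u, v):
    """The formula u ~ v."""
    return sub(u, v) and sub(v, u)


def add(u, v, w):
    """Search for x = a^i and y = a^j with u ~ bxb, v ~ byb and w ~ bxyb."""
    bound = max(len(u), len(v))
    return any(equiv(u, "b" + "a" * i + "b")
               and equiv(v, "b" + "a" * j + "b")
               and equiv(w, "b" + "a" * (i + j) + "b")
               for i in range(bound) for j in range(bound))
```

For example, `babaab` and `baabab` both encode {1, 2} and are therefore equivalent, and `add("babab", "baab", "baaab")` holds because the three words encode {1}, {2} and {3}.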

Using those formulas, one can first-order interpret in the free monoid a structure which,
when quotiented by ~, is isomorphic to $\mathcal{P}^W(N, +)$.

Assume now that the free monoid can be finite sets interpreted in some tree t. Since
structures obtainable by finite sets interpretations from t are closed under first-order interpretations
(Proposition 2.4) and quotient (Theorem 4.1), this implies that $\mathcal{P}^W(N, +)$ is
finite sets interpretable in t. By Corollary 4.4 we deduce that (N, +) is WMSO-interpretable
in t. This yields a contradiction since (N, +) is not WMSO-interpretable in a tree. Let us
provide a direct argument for proving this last claim.

Assume that (N, +) is WMSO-interpretable in t and let U denote the set of nodes of 
t that represent N in a corresponding interpretation. We first note that for each node of t 
at most one of its subtrees contains infinitely many elements from U. Otherwise, if the two 
subtrees of a node u contain both infinitely many elements from U, the successor relation 
on N (which is addition with one argument fixed to 1) would infinitely often jump between 
these two subtrees. If A is an automaton with n transitions accepting the successor relation,
and if $x_0, \dots, x_n \in U$ are in the left subtree of u, $y_0, \dots, y_n \in U$ are in the right subtree of
u, and all pairs $(x_i, y_i)$ are in the successor relation, then A also accepts a pair $(x_i, y_j)$ with $i \neq j$ by
a simple counting argument on the transitions used at u in accepting runs of A.

By starting at the root and always proceeding to the unique subtree containing infinitely 
many elements from U we obtain an infinite branch B of t. 

Now let $A_+$ be an automaton with n transitions accepting the relation + on t and let
$x_0, \dots, x_n \in U$. For each $x_i$ there are infinitely many $y_i, z_i$ such that the triple $(x_i, y_i, z_i)$ is
accepted by $A_+$. Choose a node v on B such that none of the $x_i$ is below v and for each i
choose $y_i, z_i$ as above that are below v. Counting the possible transitions that are used at v
in accepting runs of $A_+$ on the tuples $(x_i, y_i, z_i)$ we obtain that $A_+$ also accepts $(x_i, y_j, z_j)$
for some $i \neq j$. This gives a contradiction. □

5.2. A hierarchy of structures of decidable first-order theory. Caucal [Cau02] introduces
a hierarchy of graphs/structures of decidable MSO-theory. The definition of this
hierarchy that we use here differs from the original one and follows [CW03] and [Tho03].

Level 0 consists of finite structures, and level n + 1 is defined as the MSO-interpretations
of the unraveling of graphs of level n. As both MSO-interpretation and unraveling are transformations
preserving the decidability of the MSO-theory, each structure of this hierarchy
has a decidable MSO-theory. In [CW03], this hierarchy is shown to be strict.

If in these definitions the MSO-interpretations are replaced by WMSO-interpretations, 
we obtain the same hierarchy. This can be deduced from a result in [CW03] stating that each 
graph of level n can be obtained from the unravelling of a deterministic graph of level n - 1
by applying a so-called inverse rational mapping. Such an inverse rational mapping can
easily be described by a WMSO-interpretation. 

Furthermore, the unravelling of a deterministic graph yields a deterministic tree and 
on deterministic trees every WMSO-formula is equivalent to an MSO-formula. Hence, from 
the decidability of the MSO theory we also obtain the decidability of the WMSO theory 
for deterministic trees. In combination with the above mentioned result all graphs in the 
hierarchy have a decidable WMSO theory. 

From this hierarchy, it is easy to construct a corresponding tree-automatic hierarchy. 
The tree-automatic structures of level n are the images of the structures of level n of the
Caucal hierarchy by finite sets interpretations. Let us denote the nth level of this tree-automatic
hierarchy by $\mathrm{TAut}_n$. Since the trees on the first level of the Caucal hierarchy
are regular, we can deduce from Proposition 3.1 that $\mathrm{TAut}_1$ coincides with the class of
tree-automatic structures.

Furthermore, from Corollary 2.5 and the above considerations we can conclude the
following decidability result.

Remark 5.2. For each n, every structure in $\mathrm{TAut}_n$ has a decidable FO theory.

A simple application of our result is the strictness of this tree-automatic hierarchy. 

Theorem 5.3. For each $n \geq 0$ the class $\mathrm{TAut}_n$ of structures on the nth level of the
automatic hierarchy is strictly contained in $\mathrm{TAut}_{n+1}$.

Proof. We know that each level n of the Caucal hierarchy contains a tree generator $G_n$, i.e.,
each structure of level n is WMSO-interpretable in $G_n$ [CW03]. Let us suppose that the
automatic hierarchy collapses at some level n, i.e., $\mathrm{TAut}_n = \mathrm{TAut}_{n+1}$. This would imply
that the structure $\mathcal{P}^W(G_{n+1})$ can be obtained by a finite sets interpretation from $G_n$. Then,
by Corollary 4.5 we obtain $G_{n+1}$ as a WMSO-interpretation of $G_n$ and hence $G_{n+1}$ is in
the nth level of the hierarchy. This contradicts the strictness of the Caucal hierarchy. □

5.3. Random graph. The random graph is a non-oriented unlabeled countable graph with 
the following fundamental property: for any two disjoint finite sets of vertices E and F, there
exists a vertex v that is connected to all the elements of E and to none of the elements of F. 
For the existence and basic properties of such a graph see, e.g., [Hod93]. We do not give 
in this work a more precise definition of the random graph. Anyhow, a direct consequence 
of the fundamental property stated above is that the random graph satisfies the quantifier 
elimination property (and this is effective). The decidability of its first-order theory follows. 
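The fundamental property can be observed on a standard concrete model of the random graph, the so-called BIT (or Rado) graph on the natural numbers, in which x < y are adjacent iff bit x of y is set. This model is a well-known fact and is not taken from the present paper; the function names below are hypothetical. The sketch computes, for disjoint finite sets E and F, a vertex connected to all of E and to none of F.

```python
def adjacent(x, y):
    """Edge relation of the BIT graph: for x < y, bit x of y is set."""
    if x == y:
        return False
    lo, hi = min(x, y), max(x, y)
    return bool((hi >> lo) & 1)


def witness(E, F):
    """A vertex adjacent to every element of E and to no element of F."""
    E, F = set(E), set(F)
    assert not E & F, "E and F must be disjoint"
    # One extra high bit makes the witness larger than every vertex in E and F,
    # so adjacency is decided by the bits of the witness alone.
    top = max(E | F, default=-1) + 1
    return (1 << top) + sum(1 << e for e in E)
```

For E = {0, 3} and F = {1, 2, 5}, the witness is 64 + 8 + 1 = 73.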

Since the random graph has a decidable first-order theory and since finite sets inter- 
pretations define a large number of structures also having this property, it is interesting to 
consider the question whether the random graph can be obtained by a finite sets interpre- 
tation from a tree. A partial answer to this question has been studied: one knows that the 
random graph is not isomorphic to any word-automatic structure [KNRS04].

In this section, we show that there is no tree from which the random graph can be 
generated by a finite sets interpretation. This proof was obtained in a discussion with 
Vince Barany. 

Theorem 5.4. The random graph is not finite sets interpretable in a tree. 

Proof. Heading for a contradiction, let us assume that there exists a finite sets interpretation
$\mathcal{I}_R = (\delta(X), \Psi(X, Y))$ and a binary tree $t_R$ such that $\mathcal{I}_R(t_R)$ is (isomorphic to) the
random graph.

The basic idea is to prove that, under this assumption, "the random graph is WMSO-interpretable
in a tree". However, this statement is uncomfortable to handle since the
properties of the random graph only refer to its finite induced subgraphs. Instead, we
show a similar interpretability result for every finite induced subgraph of the random graph,
i.e., for every finite graph. Formally, we establish the following claim.

Claim: There exists a finite sets interpretation $\mathcal{I}'$ such that for any finite non-oriented
graph G there exists a tree $t_G$ such that $\mathcal{I}'(t_G)$ is isomorphic to $\mathcal{P}^W(G)$.

Before we prove this claim, let us demonstrate how to use it to show Theorem 5.4. Let us
apply Corollary 4.4 to the interpretation $\mathcal{I}'$. We obtain a WMSO-interpretation $\mathcal{I}''$ with
the property that $\mathcal{I}''(t_G) \cong G$ for any finite non-oriented unlabeled graph G and a suitably chosen
tree $t_G$. As trees have bounded clique-width and WMSO-interpretations applied to a class
of graphs of bounded clique-width also yield a class of graphs of bounded clique-width (see
e.g. [Cou97])³, we obtain a contradiction to the fact that there exist non-oriented graphs
of arbitrarily high clique-width (the (n × n)-grids yield such a family of graphs, see [MR99]).

Proof of the claim: Let us show first how we can encode any finite set of elements of $\mathcal{I}_R(t_R)$
by a pair of finite subsets of $\mathrm{dom}(t_R)$ in such a way that the membership relation is "definable".

Let E be a finite set of vertices of $\mathcal{I}_R(t_R)$. Each vertex of $\mathcal{I}_R(t_R)$ is a finite set of
nodes of $t_R$ and therefore it makes sense to define $D_E$ as the union of all the elements in E.
Furthermore, let us choose $I_E$ to be an element of $\mathcal{I}_R(t_R)$ which is connected to all elements
of E and to none of the elements of $\mathcal{P}(D_E) \setminus E$ (such an element exists since $\mathcal{I}_R(t_R)$ is the
random graph). From $D_E$ and $I_E$ one can easily reconstruct the set E. More precisely,
let X be an element of $\mathcal{I}_R(t_R)$. Then X belongs to E if and only if $t_R$ models
$\delta(X) \wedge X \subseteq D_E \wedge \Psi(X, I_E)$.

Let now G be a non-oriented finite graph. Since $\mathcal{I}_R(t_R)$ is the random graph, the
graph G appears as an induced subgraph of $\mathcal{I}_R(t_R)$ (cf. [Hod93]). Let V be the set of
vertices of $\mathcal{I}_R(t_R)$ inducing this subgraph.

For each subset F of V, one can construct an element $v_F$ of $\mathcal{I}_R(t_R)$ which is connected
to all the vertices in F and to no vertex in $V \setminus F$. Knowing V, this element $v_F$ completely
characterizes F. Let now V' be the set of all the $v_F$'s for all subsets F of V.

Let $t_G$ be the tree $t_R$ extended with markings describing $D_V$, $I_V$, $D_{V'}$ and $I_{V'}$. Using the
trick mentioned above, we can define the formula $X \in V$ (similarly $X \in V'$) to be
$\delta(X) \wedge X \subseteq D_V \wedge \Psi(X, I_V)$. We now want to finite sets interpret $\mathcal{P}^W(G)$ in $t_G$. Obviously, we can
identify the elements of $\mathcal{P}^W(G)$ with the elements of V'. Pursuing this idea, we define
the interpretation $\mathcal{I}' = (\delta'(X), \Psi'(X, Y), \Phi_\subseteq(X, Y))$ in the following way. The universe is
defined by $\delta'(X) = X \in V'$. The subset relation is defined by:

$$\Phi_\subseteq(X, Y) = \forall Z \in V.\ \Psi(Z, X) \rightarrow \Psi(Z, Y)$$

Finally the edge relation is defined by:

$$\Psi'(X, Y) = \mathrm{Singleton}(X) \wedge \mathrm{Singleton}(Y) \wedge \exists X', Y' \in V.\ \Psi(X', X) \wedge \Psi(Y', Y) \wedge \Psi(X', Y')$$

where Singleton(Z) stands for $Z \in V' \wedge \exists! Y \in V.\ \Psi(Y, Z)$ and $\exists!$ abbreviates "there exists
one and only one".

Using the properties linking V and V', it is not difficult to see that $\mathcal{I}'(t_G)$ is (up to
isomorphism) $\mathcal{P}^W(G)$. Furthermore, $\mathcal{I}'$ does not depend on G. This finishes the proof of
the claim and hence the proof of the theorem. □



³ To be precise, the result in [Cou97] applies to classes of finite graphs, whereas our trees $t_G$ may be
infinite. But given an interpretation that produces a class of finite graphs from a class of infinite trees one
can modify the interpretation such that one obtains the same resulting class of graphs from a class of finite
trees.

5.4. Intrinsic definability. Our last application of Theorem 4.3 concerns "intrinsic definability"
of relations. This notion is the natural adaptation of the notion of intrinsic regularity
for automatic structures [KRS04]. An automatic structure may have (up to isomorphism)
several different presentations. These presentations can have different properties in the following
sense: It might be possible to add a relation to the structure that is regular in one
presentation but is not regular in another presentation.

Consider, for example, the structure $\Delta_1$, i.e., the natural numbers with the successor
relation. One automatic presentation is to use binary encodings of the numbers but it is
also possible to use unary encoding. If we now add the predicate "being a power of 2", i.e.,
the set $\{2^n \mid n \in N\}$, then this predicate is certainly not regular for the unary encoding but
it is in the binary encoding (it corresponds to the set of all words of the form 10*). This
means that this predicate is not intrinsically regular for $\Delta_1$ because it is regular in some
presentation but not in all.
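The regular half of this example is easy to check mechanically. The sketch below tests the predicate on the binary encoding, written most significant bit first, using the regular expression 10*; the names are hypothetical. Non-regularity in the unary encoding is a separate pumping-style argument and is not something this code demonstrates.

```python
import re

# Binary encodings (most significant bit first) of the powers of 2.
POWER_OF_TWO_BINARY = re.compile(r"10*")


def is_power_of_two_binary(n):
    """Decide the predicate by matching the binary encoding against 10*."""
    return POWER_OF_TWO_BINARY.fullmatch(format(n, "b")) is not None
```

The result agrees with the arithmetic test `(n & (n - 1)) == 0` for every positive integer n.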

Accordingly, a relation is called intrinsically regular for a structure if it is regular in all
automatic presentations of the structure. In [KRS04] this notion is studied and the question
of a logical characterization of intrinsically regular relations is raised. 

In [Bar06] it is shown that for the structure $(\{0, 1\}^*, S_0, S_1, \sqsubseteq, el)$ (recall Example 2.3
above) each relation is either intrinsically regular or intrinsically non-regular, i.e., either it
is regular in every presentation or non-regular in every presentation of the structure. Such
a result can be used as a tool to show that certain structures are not automatic, which is
a difficult task (cf. [BG00]). If we add a relation to the above structure and show that
it is not regular in the natural automatic presentation, then we know that the structure
extended by this relation has no automatic presentation at all.

In this subsection we show a stronger result for another structure. In terms of automatic
structures we show that for $\mathcal{P}^W(\Delta_1)$ each relation is intrinsically regular or intrinsically
non-regular for every tree-automatic presentation of $\mathcal{P}^W(\Delta_1)$.

For this, we adapt the notions to our setting. That is, intrinsic definability considers
relations that are definable in every possible presentation of a structure by a finite sets
interpretation from a fixed tree t. If we, for example, fix this tree to be $\Delta_1$, then this
corresponds to intrinsic regularity for automatic structures.

Note that, in contrast to the previous sections, we now explicitly consider the presenta- 
tions of elements of a structure, i.e., we distinguish different codings of the same structure. 

Definition 5.5. Given a structure T, a T-presentation of a structure S of universe U is
an injection f from U to the finite subsets of T such that the set f(U) as well as the image
by f of each relation R of S are WMSO-definable on T. That is, there is a formula $\phi^f_U(X)$
over T defining the image of U under f, and for each relation R of S there is a formula
$\phi^f_R(X_1, \dots, X_r)$ using the signature of T such that for all $u_1, \dots, u_r \in U$,

$$(u_1, \dots, u_r) \in R \quad\text{iff}\quad T \models \phi^f_R(f(u_1), \dots, f(u_r)),$$

where r denotes the arity of R.

Given such a T-presentation f of S, it might be possible to add relations to S such that
f is still a T-presentation of this extended structure. Such relations are called definable in
f, i.e., R' is called definable in f if f is a T-presentation of S extended by the relation R'.

Note that to a T-presentation f of a structure S we can directly associate a finite
sets interpretation $\mathcal{I}_f$ sending T to S up to isomorphism (the isomorphism being f). An
additional relation R' is definable in f if we can add to $\mathcal{I}_f$ a formula defining R'.

If S is the weak powerset structure $\mathcal{P}^W(T)$ of T, then there is a canonical T-presentation
given by the identity mapping. We refer to this T-presentation as the standard presentation
of $\mathcal{P}^W(T)$.

The following lemma states that the "intrinsically definable" relations of $\mathcal{P}^W(t)$ for a
tree t are exactly those that are regular in the standard presentation of $\mathcal{P}^W(t)$.

Lemma 5.6. Let t be a tree and R be a relation over $\mathcal{P}^W(t)$. Then R is definable in the
standard presentation of $\mathcal{P}^W(t)$ iff R is definable in all t'-presentations f of $\mathcal{P}^W(t)$ for all
trees t'.

Proof. Obviously, if R is definable in all t'-presentations f of $\mathcal{P}^W(t)$ for all trees t', then R
is in particular definable in the standard presentation.

For the other direction, let R be a relation of arity r that is definable in the standard
presentation of $\mathcal{P}^W(t)$ and let $\Phi_R(X_1, \dots, X_r)$ be the defining formula, i.e., $\Phi_R$ is a
formula over the signature of t such that $t \models \Phi_R(U_1, \dots, U_r)$ iff $(U_1, \dots, U_r) \in R$ for all
$U_1, \dots, U_r \subseteq \mathrm{dom}(t)$. According to Proposition 2.7 we can construct an FO-interpretation
$\mathcal{I}_1$ such that $\mathcal{I}_1(\mathcal{P}^W(t))$ is the structure $\mathcal{P}^W(t)$ augmented with the relation R.

Let now f be a t'-presentation of $\mathcal{P}^W(t)$ and let $\mathcal{I}_f$ be the finite sets interpretation
with $\mathcal{I}_f(t') = \mathcal{P}^W(t)$. Then $(\mathcal{I}_1 \circ \mathcal{I}_f)(t') = \mathcal{I}_1(\mathcal{P}^W(t))$, witnessing the definability of R in
the t'-presentation f of $\mathcal{P}^W(t)$. □

The dual of Lemma 5.6, where definable is replaced by not definable, is not true in
general. This is already the case for instance for $t = t' = \Delta_2$, i.e., there is a relation R
which is not definable in the standard presentation of $\mathcal{P}^W(\Delta_2)$ but is definable in some
$\Delta_2$-presentation of $\mathcal{P}^W(\Delta_2)$.

Consider, for example, the relation R(x, y) that holds if x is on the leftmost branch,
y is on the rightmost branch, and x and y are on the same level. It is not difficult to see
that this relation is not WMSO-definable in $\Delta_2$. Hence, if we transfer R to singletons on
$\mathcal{P}^W(\Delta_2)$, it is not definable in the standard presentation of $\mathcal{P}^W(\Delta_2)$.

On the other hand, one can find a WMSO-interpretation $\mathcal{I}_2$ with $\mathcal{I}_2(\Delta_2) \cong (\Delta_2, R)$.
The finite sets interpretation $\mathcal{P}^W \circ \mathcal{I}_2$ defines a $\Delta_2$-presentation of $\mathcal{P}^W(\Delta_2)$ in which R
(transferred to singletons) is regular. The construction of $\mathcal{I}_2$ is not very difficult. It suffices
to redefine $\Delta_2$ inside itself such that the corresponding vertices of the leftmost and the
rightmost branch are located close to each other, as for example done by the following
mapping (where w is any non-empty word over {0, 1}):

Q n ^ Q n ; r ^ n lf Q n lw ^ QU^^ ^ Q n 1Qw 

The reader can verify that the two successor relations and the relation R are WMSO-definable
on this coding of $\Delta_2$.

However, in the particular case of $t = \Delta_1$ and $t' = \Delta_2$ such a converse of Lemma 5.6
does hold as expressed in the following theorem.

Theorem 5.7. If R is definable in some $\Delta_2$-presentation f of $\mathcal{P}^W(\Delta_1)$, it is definable in
every $\Delta_2$-presentation of $\mathcal{P}^W(\Delta_1)$.

As we can expect, by application of Theorem 4.3 we will obtain a WMSO-interpretation
sending $\Delta_2$ to $\Delta_1$. The two following lemmas study this kind of interpretation and
how they preserve definability.

Lemma 5.8. Let $\mathcal{I}_2$ be a WMSO-interpretation sending $\Delta_2$ to $\Delta_1$ and $f_2$ be the injection
from N to {0, 1}* witnessing the isomorphism. Then there exist naturals m, n > 0 and
words $u, v, w_0, \dots, w_{n-1} \in \{0, 1\}^*$ such that for all naturals $k \geq 0$ and $p \in \{0, \dots, n - 1\}$,

$$f_2(m + kn + p) = u v^k w_p.$$

Proof. The general idea behind the proof is that the elements from the image of $f_2$ cannot
be spread arbitrarily in $\Delta_2$ because the successor relation from $\Delta_1$ has to be definable in
WMSO and hence must be recognizable by an automaton.

We denote by $U \subseteq \{0, 1\}^*$ the image of $f_2$, and by V the closure of U by prefix. The set
V defines a subtree of $\Delta_2$. We now augment V by markings containing information on the
successor relation in such a way that these markings are definable in WMSO. Hence, the
marked tree is regular, i.e., it has only finitely many non-isomorphic subtrees (see [Tho97]
for more information on regular trees). From this regular tree we can define the words
$u, v, w_0, \dots, w_{n-1}$.

Let $\Phi_{succ}(x, y)$ be the formula of $\mathcal{I}_2$ that defines the successor relation succ of $\Delta_1$, and
let $A_{succ}$ be the equivalent tree automaton.

As succ is deterministic, one can easily show that V contains only one infinite branch
B. Otherwise, the relation succ has to jump infinitely often between two infinite branches
leading to a contradiction as $A_{succ}$ would also accept pairs of nodes that are not in succ.
The argument is the same as in the proof of Proposition 5.1 where it is shown that (N, +)
is not WMSO-interpretable in any tree t. Furthermore, this branch B is WMSO-definable.

Consider two nodes $x, y \in U$ such that $\Phi_{succ}(x, y)$ is satisfied, i.e., $f_2^{-1}(y)$ is the successor
of $f_2^{-1}(x)$. We can describe how to get from x to y by a pair of words $(z_x, z'_x)$ over {0, 1}
meaning that $x = x' z_x$ and $y = x' z'_x$ for the greatest common ancestor x' of x and y. Again,
using the determinism of succ one can show that the length of these words $z_x, z'_x$ is bounded
by some constant derived from the size of $A_{succ}$. Hence, we can mark the vertices from U
by this information (using sets $X_{z,z'}$ with $x \in X_{z,z'}$ iff $(z_x, z'_x) = (z, z')$). Obviously, this
marking is WMSO-definable.

The last information that we attach to V is for each node $x \in B$ the word $b_x \in \{0, 1\}^*$
such that the node $x b_x$ is the smallest node in U (smallest referring to the position in $\Delta_1$)
such that all nodes bigger than $x b_x$ are below x, i.e., for all $y \in U$, if $f_2^{-1}(x b_x) < f_2^{-1}(y)$,
then $x \sqsubseteq y$. The length of these $b_x$ is bounded because the relation that associates to each
x the node $b_x$ is WMSO-definable and deterministic. Hence, the marking of the nodes in B
by using sets $X_b$ with $x \in X_b$ iff $b_x = b$ is WMSO-definable.

The resulting tree t, consisting of the nodes in V with the markings described above, is
WMSO-definable and hence regular. Let $u, v \in \{0, 1\}^*$ be such that $u, uv \in B$ and the
subtrees of t rooted at u and uv are isomorphic. Let $m = f_2^{-1}(u b_u)$, $m' = f_2^{-1}(uv b_{uv})$,
and define $n = m' - m$. For $p \in \{0, \ldots, n-1\}$ let $w_p \in \{0, 1\}^*$ be such that
$f_2(m + p) = u w_p$. By the choice of m, such a $w_p$ always exists. In particular, $w_0 = b_u$.

By the choice of v, we know that $f_2(m + kn) = u v^k w_0$ for all $k \ge 0$. Furthermore, as t
is marked by the information on how to get from one node in U to its successor, we know
that the ways to get from $f_2(m + kn + p)$ to $f_2(m + kn + p + 1)$ are the same for all k.
Hence, $f_2(m + kn + p) = u v^k w_p$. □
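
The periodic shape obtained in this proof can be illustrated by a short sketch. It is not part of the construction: the words `u`, `v`, the list `ws` (standing for $w_0, \ldots, w_{n-1}$) and the table `prefix` of the first m values are hypothetical inputs chosen only for illustration. For a position m + kn + p the function returns the word $u v^k w_p$.

```python
def f2(i, m, u, v, ws, prefix):
    """Return the node (a word over '0'/'1') coding the natural number i.

    prefix -- list of the m nodes coding 0, ..., m-1 (hypothetical data)
    u, v   -- words over {0, 1}
    ws     -- list [w_0, ..., w_{n-1}] of words over {0, 1}
    """
    if i < m:
        return prefix[i]
    n = len(ws)
    # Write i as m + k*n + p with 0 <= p < n.
    k, p = divmod(i - m, n)
    return u + v * k + ws[p]
```

With n = 2 and m = 3, the positions 3, 4 are coded below u, the positions 5, 6 below uv, and so on.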

Lemma 5.9. Let $\mathcal{I}_2$ be a WMSO-interpretation sending $A_2$ to $A_1$ and $f_2$ be the injection
from N to $\{0, 1\}^*$ witnessing the isomorphism. If R is a relation over finite subsets of $f_2(N)$
WMSO-definable in $A_2$, then its inverse image under $f_2$ is WMSO-definable in $A_1$.
Figure 2: Regular subtree of $A_2$ induced by the domain of the interpretation $\mathcal{I}_2$ from
Lemmas 5.8 and 5.9. The part from u to v is iterated.

Proof. The formula $\varphi_R(X_1, \ldots, X_r)$ defining R can be represented by a tree automaton
$\mathcal{A}_R$ with state set $Q_R$. Using Lemma 5.8, this automaton can be simulated by an automaton
$\mathcal{A}'_R$ on $A_1$ as we show in the following. On $A_1$ automata and WMSO have the same
expressive power and hence the construction of $\mathcal{A}'_R$ suffices to prove the lemma.

Let $u, v, w_0, \ldots, w_{n-1}$ be as in Lemma 5.8. The states of $\mathcal{A}'_R$ correspond to partial
runs of $\mathcal{A}_R$ on a finite subtree of $A_2$ that

• is rooted at $\varepsilon$ and induced by the elements $f_2(0), \ldots, f_2(m-1)$ for the first
segment of $A_1$,

• rooted at $uv^k$ and induced by the words $v, w_0, \ldots, w_{n-1}$ for the following segments
of $A_1$.

The corresponding parts of $A_2$ on which $\mathcal{A}'_R$ has to simulate a run of $\mathcal{A}_R$ are depicted
in Figure 2 (for n = 2 and m = 3).

For the formal definitions, let $U_0$ be the set of all nodes that are prefix of u or of some
$f_2(i)$ for $0 \le i < m$, and let $U_1$ be the set of all nodes that are prefix of v or one of
$w_0, \ldots, w_{n-1}$. The set $U_0$ corresponds to the upper finite tree in Figure 2 surrounded by a
dashed line, and the set $U_1$ to the finite tree rooted at u.

The automaton $\mathcal{A}'_R$ reads a word $\alpha \in (\{0, 1\}^r)^\omega$ and has to decide if this
labeling transferred by $f_2$ to $A_2$ corresponds to a tuple of sets in R. To simulate a run of
$\mathcal{A}_R$ it guesses partial runs, starting with a partial run on $U_0$, and then continuing with
$U_1$, the periodic part of the tree.

More formally, it starts by guessing a pair $(\rho_0, \lambda_0)$ with mappings $\rho_0 : U_0 \to Q_R$ and
$\lambda_0 : \{f_2(0), \ldots, f_2(m-1)\} \to \{0, 1\}^r$ such that $\rho_0$ corresponds to a partial run of
$\mathcal{A}_R$ on $U_0$ with labels corresponding to $\lambda_0$. In the next steps, $\mathcal{A}'_R$ verifies if the
guessed labeling is correct, i.e., if $\alpha(i) = \lambda_0(f_2(i))$. When reaching position m, $\mathcal{A}'_R$
guesses a new pair $(\rho_1, \lambda_1)$, now with mappings $\rho_1 : U_1 \to Q_R$ and
$\lambda_1 : \{w_0, \ldots, w_{n-1}\} \to \{0, 1\}^r$ such that $\rho_1$ is a possible continuation of $\rho_0$ on
the subtree rooted at u that is shown in Figure 2 and labeled according to $\lambda_1$. The guessed
labeling is again verified on the segment $m, \ldots, m + n - 1$ and then a pair $(\rho_2, \lambda_2)$ of the
same type as $(\rho_1, \lambda_1)$ is guessed, and so on. The automaton accepts if the concatenation of
the guessed partial runs on the path $uv^\omega$ satisfies the acceptance condition of $\mathcal{A}_R$. For
this to work, the guessed partial runs have to be such that they can be continued to accepting
runs on the "blank parts" of $A_2$, i.e., those infinite subtrees that do not contain a node from
the image of $f_2$. □

Using this lemma we can prove Theorem 5.7.



Proof of Theorem 5.7. Assuming an $A_2$-presentation f of $\mathcal{P}^W(A_1)$ and $\mathcal{I}$ the
corresponding finite sets interpretation, one obtains by Corollary 4.4 a WMSO-interpretation
$\mathcal{I}_2$ with $\mathcal{I}_2(A_2) = A_1$. Let $f_2 : N \to \{0, 1\}^*$ be the injection witnessing this
isomorphism and let $f_2$ also denote its extension to sets.

We now have two ways to obtain isomorphic copies of $\mathcal{P}^W(A_1)$ from $A_2$: by applying
$\mathcal{I}$ and by applying $\mathcal{I}_2$ followed by $\mathcal{P}^W$. These two ways yield isomorphic structures
and hence there is an isomorphism h sending $\mathcal{I}(A_2)$ to $\mathcal{P}^W(\mathcal{I}_2(A_2))$. We obtain the
following picture, where dashed arrows represent interpretations, while normal arrows are for
isomorphisms:



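[Diagram: the structures $A_2$, $\mathcal{I}(A_2)$, $\mathcal{I}_2(A_2)$, $\mathcal{P}^W(\mathcal{I}_2(A_2))$ and $A_1$, connected
by dashed arrows for the interpretations and normal arrows for the isomorphisms.]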



We show that we can define the isomorphism h on $A_2$ by a WMSO-formula. To understand
this, consider a finite set U of naturals, i.e., an element of $\mathcal{P}^W(A_1)$. This set U corresponds
to two sets X and Y of nodes of $A_2$, namely $X = f(U)$ and $Y = f_2(U)$. The isomorphism h
relates these two sets, i.e., $h(X) = Y$. This relation can be defined in WMSO
using the formula Code that we obtain from Theorem 4.3 and that is used to construct
$\mathcal{I}_2$ in Corollary 4.4. The formula Code(X, x) relates each subset of $A_2$ that is an atom in
$\mathcal{P}^W(A_1) = \mathcal{I}(A_2)$ to a single node x of $A_2$. Assume that the set X represents the atom {n}.
Then the unique x such that Code(X, x) is satisfied represents n in $\mathcal{I}_2(A_2) = A_1$ (by the
construction of $\mathcal{I}_2$ in the proof of Corollary 4.4), and the singleton {x} represents the atom
{n} in $\mathcal{P}^W(\mathcal{I}_2(A_2))$. We get that $h(X) = \{x\}$, i.e., the formula Code defines the isomorphism
h on the level of atoms. It is easy to extend this to sets:

$$\varphi_h(X, Y) = \forall y\, \bigl(y \in Y \leftrightarrow \exists Z\, (\varphi_\subseteq(Z, X) \wedge Code(Z, y))\bigr).$$

Let now R be a relation over $\mathcal{P}^W(A_1)$ which is definable in f. This means that f(R)
is WMSO-definable in $A_2$. Since h is WMSO-definable, it follows that h(f(R)) is also
WMSO-definable in $A_2$. Finally, by Lemma 5.9 we obtain that $f_2^{-1}(h(f(R)))$ is
WMSO-definable in $A_1$. Since $f_2^{-1} \circ h \circ f$ is obtained as a composition of isomorphisms,
it is an automorphism of $\mathcal{P}^W(A_1)$. Remark now that the identity is the only automorphism
of $\mathcal{P}^W(A_1)$ (a property inherited from $A_1$). It follows that $f_2^{-1}(h(f(R)))$ equals R, and
consequently R is definable in the standard presentation. By Lemma 5.6 it is definable in
every presentation. □
6. Proof of the main result 

The proof of Theorem 4.3 is rather complex and split into several parts. In Subsection 6.1
we introduce the key notions used afterwards while we make the scheme of the proof
more precise. This will also be the occasion for explaining the content of Subsections 6.2,
6.3, 6.4, and 6.5. In Subsection 6.6 things are put together and the proof of Theorem 4.3 is
finally given.

6.1. First definitions and presentation of the proof. We assume from now on that a
finite sets interpretation $\mathcal{I} = (\delta(X), \varphi_\subseteq(X, Y))$ is fixed. Throughout the proof we use a
tree t together with a set E and an isomorphism f that are assumed to satisfy the equality

$$f(\mathcal{P}^F(E)) = \mathcal{I}(t).$$

The reader must keep in mind that none of the constructions we perform makes use of t,
E, or f. Hence, the result will hold for any such tree, set, and isomorphism. This lightens
the presentation of the proof by avoiding systematic quantification over those objects.

We consider the set Atoms of finite subsets of t representing atoms of the powerset
lattice, i.e.,

$$\mathit{Atoms} = \{f(\{u\}) : u \in E\}.$$

The set Atoms can be defined as the set of finite subsets of t which are minimal (for
the $\varphi_\subseteq$ formula seen as an ordering) and distinct from the minimal element itself (which
is $f(\emptyset)$). This description can be done in weak monadic second-order logic. Hence Atoms
is regular in t and there exists an automaton

■A Atoms =(QAt omsi Q Atoms' ^ Atoms j ^ Atoms) 

accepting the language Atoms. We also consider the binary relation Mem over $\mathcal{I}(t)$ defined
as the image under f of the $\in$ relation in $\mathcal{P}^F(E)$, i.e.,

$$\mathit{Mem} = \{(f(\{u\}), f(V)) : u \in V \subseteq E \text{ and } V \text{ finite}\}.$$

This Mem relation is also definable in weak monadic second-order logic, and consequently
is regular. We fix

•A-M em = {QMemiQiMemi^Memi^Mem) 

to be an automaton recognizing the relation Mem. 

Recall that the theorem we want to prove claims the existence of a formula Code (X, x) 
such that the corresponding relation is an injection from Atoms into dom(t). 

Our goal in the construction of Code is to uniquely attach to each X in Atoms an
element in dom(t) in a WMSO-definable way. As a first approximation, in Subsections 6.2,
6.3, and 6.4 we define a mapping Index which assigns to each X in Atoms a node in dom(t).
Though the Index mapping is not in general an injection from Atoms into dom(t), it does
not concentrate many indices in the same area of the tree t either. Formally, if we set D(x)
for $x \in t$ to be the cardinality of $\mathit{Index}^{-1}(x)$, then by Lemmas 6.6 and 6.12 D is
a sparse distribution (see Definition 6.1 below). Subsection 6.5 is dedicated to the study
of sparse distributions. The central lemma of this part, Lemma 6.17, establishes that, given
elements concentrated according to a sparse distribution, we can uniformly redistribute
them in dom(t) in a unique WMSO-definable way. Applied to our case, this means that
the Index mapping can be transformed into an injection by use of WMSO-formulas. This
last step is used in Subsection 6.6 to complete the proof of Theorem 4.3.
The key definition connecting the two main parts of the proof (definition of the Index
mapping and turning it into an injection) is the notion of sparsity. This definition requires
the notion of zone. A zone Z in t is a connected subset of dom(t), where t is seen as a
non-oriented graph. That is, Z contains a minimal element w.r.t. the prefix ordering, and
whenever $x \sqsubseteq y \sqsubseteq z$ for x, z in Z, then also $y \in Z$. A zone Z is completely characterized by
its least element x, and by the minimal elements $x_1, \ldots, x_n$ that are below x and not in Z.
The elements $\{x, x_1, \ldots, x_n\}$ are called the frontier of the zone. Given nodes $x, x_1, \ldots, x_n$
of t such that the $x_i$ are pairwise incomparable and $x \sqsubset x_i$ for all i, we define
$N_t^{x, x_1, \ldots, x_n}$ to be the set of nodes y such that $x \sqsubseteq y$ and $x_i \not\sqsubseteq y$ for all $i \in [n]$,
where [n] denotes the set $\{1, \ldots, n\}$. By construction, $N_t^{x, x_1, \ldots, x_n}$ is the only zone which
has frontier $\{x, x_1, \ldots, x_n\}$.
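
The notions of zone and frontier can be made concrete with a small sketch on a finite tree. The representation is an assumption made only for illustration: nodes are words over '0'/'1' stored as Python strings, the prefix ordering is `str.startswith`, and the helper names `zone` and `frontier` are hypothetical.

```python
def zone(nodes, x, cut):
    """Nodes y of the tree with x a prefix of y and no element of `cut` a prefix of y."""
    return {y for y in nodes
            if y.startswith(x) and not any(y.startswith(c) for c in cut)}


def frontier(nodes, z):
    """Least element of the zone z together with the minimal nodes below it outside z."""
    top = min(z, key=len)
    # A minimal node outside z below `top` is a node whose parent lies in z.
    exits = {y for y in nodes if y and y not in z and y[:-1] in z}
    return {top} | exits
```

On the full binary tree of depth 3, the zone with least element "0" and cut nodes "00" and "011" consists of "0", "01" and "010", and its frontier gives back exactly the three nodes used to define it.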

Definition 6.1. A distribution D is a mapping from dom(t) to N. For Z a finite zone,
D(Z) stands for $\sum_{x \in Z} D(x)$. A distribution D is K-sparse for some $K \in N$ if for every
finite zone Z of frontier F, $D(Z) \le |Z| + K|F|$. A distribution is strongly K-sparse if for
any finite zone Z of frontier F, $D(Z) \le K|F|$.

Sparsity tells us that no finite zone contains more indices than its size plus a factor 
linearly depending on the size of the frontier. 
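
As an illustration of the definition, the following brute-force sketch checks K-sparsity on a small finite tree by enumerating all zones. It assumes that nodes are words over '0'/'1' stored as strings and that the distribution is a dictionary; the function names are hypothetical and the enumeration is exponential, so it is only meant for toy examples.

```python
from itertools import combinations


def is_zone(z):
    """A non-empty set of nodes is a zone if it has a least element and is prefix-convex."""
    top = min(z, key=len)
    return all(y.startswith(top)
               and all(y[:k] in z for k in range(len(top), len(y)))
               for y in z)


def frontier_size(nodes, z):
    """Size of the frontier: the least element plus the nodes just outside z."""
    return 1 + sum(1 for y in nodes if y and y not in z and y[:-1] in z)


def is_sparse(nodes, D, K):
    """Check D(Z) <= |Z| + K * |F| for every zone Z of the finite tree."""
    nodes = sorted(nodes)
    for r in range(1, len(nodes) + 1):
        for z in map(set, combinations(nodes, r)):
            if is_zone(z) and sum(D[y] for y in z) > len(z) + K * frontier_size(nodes, z):
                return False
    return True
```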

6.2. Important nodes. In order to construct the mapping Index, given an element X
of Atoms, we first define the set $I(X) \subseteq dom(t)$ of its important nodes via combinatorial
constraints. Essentially, we try to locate the places where "important coding decisions" are
made by the automaton $\mathcal{A}_{Atoms}$ when reading X. In the present subsection, we provide the
key combinatorial lemmas concerning important nodes.

Then, depending on the shape of the set I(X), two cases are separated and two distinct
definitions of Index are given. The first kind of index is called standard index, denoted
SIndex(X), and is the subject of Subsection 6.3. The other kind is called branch index,
denoted BIndex(X), and is the subject of Subsection 6.4.

Let us first introduce a convenient notation for studying the behavior of the automata
$\mathcal{A}_{Atoms}$ and $\mathcal{A}_{Mem}$ over zones: For a zone $Z = N_t^{x, x_1, \ldots, x_n}$ and states
$q, q_1, \ldots, q_n \in Q_{Atoms}$, we denote by $\mathit{Atoms}(q, Z, q_1, \ldots, q_n)$ the set of all $X \subseteq Z$ such
that there exists $X' \subseteq dom(t)$ with

• $X = X' \cap Z$, and

• X' is accepted by $\mathcal{A}_{Atoms}$ with a run $\rho$ such that $\rho(x) = q$ and $\rho(x_i) = q_i$ for all
$i \in [n]$.

Similarly, for $q, q_1, \ldots, q_n \in Q_{Mem}$ we denote by $\mathit{Mem}(q, Z, q_1, \ldots, q_n)$ the set of all pairs
(X, Y) with $X, Y \subseteq Z$ such that there are $X', Y' \subseteq dom(t)$ with

• $X = X' \cap Z$, $Y = Y' \cap Z$, and

• (X', Y') is accepted by $\mathcal{A}_{Mem}$ with a run $\rho$ such that $\rho(x) = q$ and $\rho(x_i) = q_i$ for all
$i \in [n]$.

The definition of important nodes is then the following, where the constant $K_{im}$ is
chosen to make the combinatorial arguments in the subsequent lemmas work.

Definition 6.2. Let $K_{im} = (2|Q_{Mem}| + 1)|Q_{Atoms}|$. Given $X \in \mathit{Atoms}$, a node $x \in dom(t)$
is called important for X if

$$|\{Y \subseteq N_t^x : (X \setminus N_t^x) \cup Y \in \mathit{Atoms}\}| > K_{im}.$$

We denote by I(X) the set of important nodes for X.
Hence a node x is important for X if there are many (i.e., more than $K_{im}$) ways
to modify X below x while remaining in Atoms. Intuitively, without knowing what X looks
like below x, we cannot say much about which atom is coded because there are too many
possibilities left. Note that the set I(X) is by definition prefix closed. The fundamental
property that we show in Lemma 6.4 is that for an important node x of X, the part of
X that is not below x comes from a set of small size. To prove this lemma we need its
combinatorial core stated in the following lemma.

Lemma 6.3. Let $K_c = 2|Q_{Mem}| + 1$. For any two disjoint zones Z and Z' of respective
frontiers $F = \{x, x_1, \ldots, x_n\}$ and $F' = \{x', x'_1, \ldots, x'_m\}$, and all accepting runs $\rho$ of
$\mathcal{A}_{Atoms}$,

either $|\mathit{Atoms}(\rho(x), Z, \rho(x_1), \ldots, \rho(x_n))| < K_c$,

or $|\mathit{Atoms}(\rho(x'), Z', \rho(x'_1), \ldots, \rho(x'_m))| < K_c$.

Proof. It is sufficient for us to prove the result for two complementary zones. This comes
from the fact that increasing a zone also increases the number of possible projections w.r.t.
a fixed run, i.e., $|\mathit{Atoms}(\rho(x), Z, \rho(x_1), \ldots, \rho(x_n))| \le |\mathit{Atoms}(\rho(y), Z'', \rho(y_1), \ldots, \rho(y_\ell))|$
for a zone Z'' of frontier $\{y, y_1, \ldots, y_\ell\}$ with $Z \subseteq Z''$. Hence, we will assume Z to be
$N_t^{\varepsilon, x}$ and Z' to be $N_t^x$ for some node x.

Assume that for some $K \ge K_c$ we have distinct sets $X_1, \ldots, X_K$ in $\mathit{Atoms}(\rho(\varepsilon), Z, \rho(x))$
and distinct sets $X'_1, \ldots, X'_K$ in $\mathit{Atoms}(\rho(x), Z')$. Then, for every $i, j \in [K]$, let $Y_{i,j}$ be
$X_i \cup X'_j$. As the union of Z and Z' gives the whole domain of t, we have $Y_{i,j} \in \mathit{Atoms}$
for all $i, j \in [K]$.

Let us now consider the set Comb of possible combinations of the $Y_{i,j}$, combination
in the sense of the relation Mem. More precisely, $A \subseteq dom(t)$ is in Comb if whenever
$(Y, A) \in \mathit{Mem}$ holds for some atom Y, then $Y = Y_{i,j}$ for some i, j. The cardinality of Comb
is $2^{K^2}$. We now show by a combinatorial argument that $\mathcal{A}_{Mem}$ cannot distinguish all the
elements from Comb because the amount of information that can be passed between the
two zones Z and Z' is limited by the number of states in $Q_{Mem}$.

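For this purpose, we define for each $A \in \mathit{Comb}$ the mappings $f_A : [K] \times Q_{Mem} \to \{0, 1\}$
and $g_A : Q_{Mem} \times [K] \to \{0, 1\}$ by

$$f_A(i, q) = \begin{cases} 1 & \text{if } (X_i, A \cap Z) \in \mathit{Mem}(q^I_{Mem}, Z, q), \\ 0 & \text{else,} \end{cases}$$

$$g_A(q, j) = \begin{cases} 1 & \text{if } (X'_j, A \cap Z') \in \mathit{Mem}(q, Z'), \\ 0 & \text{else.} \end{cases}$$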



It is obvious that if two sets $A, B \in \mathit{Comb}$ are such that $f_A = f_B$ and $g_A = g_B$, then
$(Y_{i,j}, A) \in \mathit{Mem}$ iff $(Y_{i,j}, B) \in \mathit{Mem}$ for all $i, j \in [K]$. This means, by definition of Comb,
that A = B.

However, there are only $2^{2|Q_{Mem}|K}$ different possible values for the pair $(f_A, g_A)$. Hence
we obtain $|\mathit{Comb}| \le 2^{2|Q_{Mem}|K}$. This contradicts $|\mathit{Comb}| = 2^{K^2}$. □
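
For the reader's convenience, the arithmetic behind this contradiction can be spelled out as follows. This is only a short worked step, assuming nothing beyond $K \ge 2|Q_{Mem}| + 1$, which holds for every K considered in the proof:

```latex
K \ge 2|Q_{Mem}| + 1
\;\Longrightarrow\;
K^2 \ge (2|Q_{Mem}| + 1)\,K > 2|Q_{Mem}|\,K
\;\Longrightarrow\;
2^{K^2} > 2^{2|Q_{Mem}|K} .
```

So a set of cardinality $2^{K^2}$ cannot be bounded in size by $2^{2|Q_{Mem}|K}$.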

The following lemma shows that the possibilities to code an atom 'above' an important 
node are bounded. 

Lemma 6.4. For each node x we have $|\{X \cap N_t^{\varepsilon, x} : X \in \mathit{Atoms} \text{ and } x \in I(X)\}| < K_{im}$.
Proof. We are aiming at a contradiction to Lemma 6.3 for $Z = N_t^{\varepsilon, x}$ and $Z' = N_t^x$.

For each X with $x \in I(X)$ there are more than $K_{im}$ many $Y \subseteq N_t^x$ such that
$X_{Y,x} := (X \setminus N_t^x) \cup Y$ is in Atoms. Since $K_{im} = |Q_{Atoms}| \cdot K_c$ (with $K_c$ from Lemma 6.3),
we can choose a state $q_{X,x} \in Q_{Atoms}$ such that more than $K_c$ of these $X_{Y,x}$ are accepted by
$\mathcal{A}_{Atoms}$ with a run that labels x with $q_{X,x}$. This means that $|\mathit{Atoms}(q_{X,x}, N_t^x)| > K_c$.

Now, assume that there are $K_{im}$ different $X \in \mathit{Atoms}$ with $x \in I(X)$ that differ on
$N_t^{\varepsilon, x}$. Then there are at least $K_c$ such sets $X_1, \ldots, X_{K_c}$ with
$q_{X_1,x} = \cdots = q_{X_{K_c},x} =: q$. In particular, we obtain $|\mathit{Atoms}(q^I_{Atoms}, N_t^{\varepsilon, x}, q)| \ge K_c$.
Together with $|\mathit{Atoms}(q, N_t^x)| > K_c$ from above we obtain the desired contradiction. □



6.3. Standard indices. We now address the problem of computing Index(X) for some
atom X under the assumption that I(X) is not an infinite branch (the case when I(X) is
an infinite branch is treated in Subsection 6.4). Since we call this case the standard case,
we will denote the index defined for such atoms X by SIndex(X). The simplest case is that
I(X) is totally ordered by $\sqsubseteq$, i.e., I(X) is a finite path starting from the root. We simply
define SIndex(X) to be the last node on this path. The other case is that I(X) is neither
a finite path nor an infinite branch, i.e., I(X) is not totally ordered by $\sqsubseteq$. In this
situation, we define SIndex(X) to be the first node at which I(X) splits into two paths.
Those two cases are unified in the following definition.

Definition 6.5. For $X \in \mathit{Atoms}$ such that I(X) is not an infinite branch, the index of X,
written SIndex(X), is the maximal element in I(X) which is comparable to every element
in I(X).
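
The definition can be illustrated on finite data. The sketch below assumes that the set of important nodes is given as a finite, non-empty, prefix-closed set of words over '0'/'1' (Python strings); computing I(X) itself is not attempted, and the function names are hypothetical.

```python
def comparable(a, b):
    """Two nodes are comparable if one is a prefix of the other."""
    return a.startswith(b) or b.startswith(a)


def sindex(important):
    """Deepest node of `important` that is comparable to every node of `important`."""
    candidates = [x for x in important
                  if all(comparable(x, y) for y in important)]
    # The candidates are pairwise comparable, so the longest one is the maximum.
    return max(candidates, key=len)
```

For a finite path the result is its last node; for a set that branches, it is the node at which the first split occurs.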

As already mentioned, the intention of this definition is that SIndex(X) roughly locates
in the tree where the main information concerning the atom coded by X lies. This location
is far from being precise, and many elements of Atoms may have the same index. However,
we will see that it is possible to obtain a good understanding of the distribution of the
standard indices. The following lemma gives precise bounds on the quantity of indices that
may occur in a zone, i.e., it states that the distribution assigning to each node x the number
of X such that SIndex(X) = x is sparse.

Lemma 6.6 (sparsity). There is a constant $K_s$ such that $|\mathit{SIndex}^{-1}(Z)| \le |Z| + K_s|F|$ for
every finite zone Z of frontier F.

Proof. Denote the elements of the frontier of Z by x and $x_1, \ldots, x_n$, i.e., $Z = N_t^{x, x_1, \ldots, x_n}$.
The proof of the lemma consists of two steps. We first show that for atoms X such that
SIndex(X) is inside Z, the amount of information located outside Z is bounded. More
precisely, we first show for $M := K_{im} \cdot |Q_{Atoms}|$

(a) $|\{X \cap N_t^{\varepsilon, x} : X \in \mathit{Atoms} \text{ and } \mathit{SIndex}(X) \in Z\}| \le M$ and

(b) $|\{X \cap N_t^{x_i} : X \in \mathit{Atoms} \text{ and } \mathit{SIndex}(X) \in Z\}| \le M$ for all $i \in [n]$.

For (a) note that from $\mathit{SIndex}(X) \in Z$, the definition of SIndex, and the prefix closure of
I(X) we obtain that $x \in I(X)$. Therefore,

$$\{X \cap N_t^{\varepsilon, x} : X \in \mathit{Atoms} \text{ and } \mathit{SIndex}(X) \in Z\} \subseteq \{X \cap N_t^{\varepsilon, x} : X \in \mathit{Atoms} \text{ and } x \in I(X)\}$$

and (a) follows from Lemma 6.4.
For (b) we show that for each $X \in \mathit{SIndex}^{-1}(Z)$ and each $x_i$ there is a state $q \in Q_{Atoms}$
such that $X \cap N_t^{x_i} \in \mathit{Atoms}(q, N_t^{x_i})$ and $|\mathit{Atoms}(q, N_t^{x_i})| \le K_{im}$. From this, (b) follows
because each $X \cap N_t^{x_i}$ comes from one of at most $|Q_{Atoms}|$ many sets of size at most $K_{im}$.
We distinguish two cases.

If $x_i \notin I(X)$, then we take q to be the state at $x_i$ in an accepting run of $\mathcal{A}_{Atoms}$ on X.
From the definition of I(X) we immediately obtain the desired property.

Else, if $x_i \in I(X)$, by definition of standard indices there must be some $y \in I(X)$
incomparable to $x_i$, the index of X being the deepest common ancestor of $x_i$ and y. From the
definition of important nodes for X and from $K_{im} = |Q_{Atoms}| \cdot K_c$ we obtain that there
exists a set $Y \subseteq N_t^y$ such that $(X \setminus N_t^y) \cup Y$ is in Atoms and is accepted with a run of
$\mathcal{A}_{Atoms}$ that labels y by a state q' such that $|\mathit{Atoms}(q', N_t^y)| > K_c$. Let q be the state
assumed at node $x_i$ by this run. From Lemma 6.3 applied to the zones $N_t^{x_i}$, $N_t^y$, and to
the aforementioned run, we can conclude that $|\mathit{Atoms}(q, N_t^{x_i})| < K_c$. The desired property
follows from $K_c \le K_{im}$. This finishes the proof of (b).

After these preliminary considerations, we come back to the claim of the lemma. We
denote the elements from $\{X \cap N_t^{\varepsilon, x} : \mathit{SIndex}(X) \in Z\}$ by $X^1, \ldots, X^M$ and the elements
from $\{X \cap N_t^{x_i} : \mathit{SIndex}(X) \in Z\}$ by $X_i^1, \ldots, X_i^M$ (the same element can be represented
more than once; what is important is that all elements are represented).

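Now, consider the set Comb of all combinations of atoms from $\mathit{SIndex}^{-1}(Z)$ (in the same
sense as in the proof of Lemma 6.3). A combination $A \in \mathit{Comb}$ is entirely characterized by
the following objects:

• the set $A \cap Z$,

• the mapping $f_{A,x} : [M] \times Q_{Mem} \to \{0, 1\}$ with

$$f_{A,x}(j, q) = \begin{cases} 1 & \text{if } (X^j, A \cap N_t^{\varepsilon, x}) \in \mathit{Mem}(q^I_{Mem}, N_t^{\varepsilon, x}, q), \\ 0 & \text{else,} \end{cases}$$

• and, for each $i \in [n]$, the mapping $f_{A,x_i} : Q_{Mem} \times [M] \to \{0, 1\}$ with

$$f_{A,x_i}(q, j) = \begin{cases} 1 & \text{if } (X_i^j, A \cap N_t^{x_i}) \in \mathit{Mem}(q, N_t^{x_i}), \\ 0 & \text{else.} \end{cases}$$

Thus, $|\mathit{Comb}| \le 2^{|Z|} \cdot 2^{M|Q_{Mem}|} \cdot \prod_{i=1}^{n} 2^{|Q_{Mem}|M}$. For $K_s = |Q_{Mem}|M$ we obtain

$$2^{|\mathit{SIndex}^{-1}(Z)|} = |\mathit{Comb}| \le 2^{|Z| + K_s|F|}$$

and hence $|\mathit{SIndex}^{-1}(Z)| \le |Z| + K_s|F|$. □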



6.4. Treatment of infinite branches. It is possible that for some $X \in \mathit{Atoms}$ the set
I(X) of important nodes is an infinite branch. For these X we also develop a notion of
index, called BIndex(X), and show that the distribution obtained in this way is strongly
sparse. Since the sum of a K-sparse distribution and of a strongly K'-sparse distribution is
a (K + K')-sparse distribution, we can add the indices corresponding to infinite branches to
the other indices without affecting the sparsity of the induced distribution.
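
The closure property used here follows directly from Definition 6.1. As a short worked step, for a K-sparse distribution D, a strongly K'-sparse distribution D', and any finite zone Z of frontier F:

```latex
(D + D')(Z) \;=\; D(Z) + D'(Z)
\;\le\; \bigl(|Z| + K|F|\bigr) + K'|F|
\;=\; |Z| + (K + K')\,|F| .
```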

In this subsection, we call the infinite branches that are equal to I{X) for some atom X 
important branches. We start with the helpful observation that the number of elements of 
Atoms corresponding to the same important branch is bounded. 

Lemma 6.7. For every important branch B, $|I^{-1}(B)| < K_{im}$.

Proof. If there are $K_{im}$ different sets in $I^{-1}(B)$, then we can pick a node x on B such that
all these sets differ on the zone $N_t^{\varepsilon, x}$. Since x is important for all X in $I^{-1}(B)$, we obtain a
contradiction to Lemma 6.4. □

Our goal is to associate to every important branch B a node VInd(B) on B such that

(1) at most $K_{im}$ branches are mapped to the same node by VInd, and

(2) if some VInd(B') is above VInd(B), then VInd(B) is not in B'.

Those two properties are established in Lemma 6.11. Then Lemma 6.12 uses those two
properties for concluding that $\mathit{VInd} \circ I$ has a strongly sparse distribution.

Our main tool for constructing VInd is to produce a well-founded order for branches.
For this, we define RInd(B) for every important branch B by

$$\mathit{RInd}(B) = \min\{x \in B : \exists X \subseteq N_t^{\varepsilon, x},\ I(X) = B\}.$$

Since we consider finite sets interpretations, RInd(B) is always defined. Lemma 6.4 applied
to the node RInd(B) directly leads to the following lemma.

Lemma 6.8. For all nodes x, $|\mathit{RInd}^{-1}(x)| < K_{im}$.

The well-foundedness argument announced above is then the following. 

Lemma 6.9. For every important branch B, there are finitely many important branches B'
such that $\mathit{RInd}(B') \sqsubseteq \mathit{RInd}(B)$.

Proof. One has that $\mathit{RInd}(B') \sqsubseteq \mathit{RInd}(B)$ iff B' belongs to $\bigcup_{x \sqsubseteq \mathit{RInd}(B)} \mathit{RInd}^{-1}(x)$.
This set is finite by Lemma 6.8. □

Now, we can define for a branch B the index VInd(B) as being the first node in B
below RInd(B) which is not lying on an important branch strictly inferior with respect to
comparing the RInd values. Formally

$$\mathit{VInd}(B) = \min\{x \in B : \mathit{RInd}(B) \sqsubseteq x,\ \forall B'.\ \mathit{RInd}(B') \sqsubset \mathit{RInd}(B) \to x \notin B'\}.$$

This definition is sound thanks to Lemma 6.9. Furthermore, VInd and RInd can be related
in the following way.

Lemma 6.10. $\mathit{VInd}(B) \sqsubseteq \mathit{VInd}(B')$ implies $\mathit{RInd}(B) \sqsubseteq \mathit{RInd}(B')$.

Proof. Assume $\mathit{VInd}(B) \sqsubseteq \mathit{VInd}(B')$. Since $\mathit{VInd}(B') \in B'$ we obtain $\mathit{VInd}(B) \in B'$. As
by definition $\mathit{RInd}(B) \sqsubseteq \mathit{VInd}(B)$, we also have $\mathit{RInd}(B) \in B'$. Consequently RInd(B)
and RInd(B') lie on the same branch B', and thus are comparable. For the sake of
contradiction, suppose $\mathit{RInd}(B') \sqsubset \mathit{RInd}(B)$; then by definition of VInd we obtain
$\mathit{VInd}(B) \notin B'$. Contradiction. The remaining case is the expected $\mathit{RInd}(B) \sqsubseteq \mathit{RInd}(B')$. □

We are ready to establish the two properties wanted for VInd. 

Lemma 6.11. The following holds. 

(1) For every node x, $|\mathit{VInd}^{-1}(x)| < K_{im}$.

(2) If $\mathit{VInd}(B') \sqsubset \mathit{VInd}(B)$, then $\mathit{VInd}(B) \notin B'$.

Proof. (1): Let B be an infinite branch such that VInd(B) = x and let y = RInd(B). By
Lemma 6.10, important branches with the same VInd also have the same RInd and hence
$\mathit{VInd}^{-1}(x) \subseteq \mathit{RInd}^{-1}(y)$. The desired bound follows from Lemma 6.8.

(2): By Lemma 6.10, if $\mathit{VInd}(B') \sqsubset \mathit{VInd}(B)$ then $\mathit{RInd}(B') \sqsubseteq \mathit{RInd}(B)$. If
$\mathit{RInd}(B') \sqsubset \mathit{RInd}(B)$, the claim follows by definition of VInd(B). The case
RInd(B') = RInd(B) is not






possible since this would imply that VInd(B') = VInd(B) because they are both lying on
the branch B according to the assumption $\mathit{VInd}(B') \sqsubset \mathit{VInd}(B)$. □

Now, for $X \in \mathit{Atoms}$ such that I(X) is an infinite branch we define BIndex(X) to be
VInd(I(X)). The distribution induced by BIndex is strongly sparse:

Lemma 6.12 (strong sparsity). For Z a finite zone of frontier F, $|\mathit{BIndex}^{-1}(Z)| \le K_{im}^2 |F|$.

Proof. Assume that $\mathit{VInd}(B), \mathit{VInd}(B') \in Z$ for two branches B and B'. If the two branches
exit Z at the same point, i.e., if $B \cap F = B' \cap F$, then B and B' do not differ inside Z.
As $\mathit{VInd}(B), \mathit{VInd}(B') \in Z$ we conclude that $\mathit{VInd}(B) \in B'$ and $\mathit{VInd}(B') \in B$. Applying
Lemma 6.11 (2) yields VInd(B) = VInd(B').

According to Lemmas 6.7 and 6.11 (1), the number of atoms X that share the same
value BIndex(X) is bounded by $K_{im}^2$. This shows that for each exit of the zone Z there are
at most $K_{im}^2$ branches B with $\mathit{VInd}(B) \in Z$ leaving Z through that exit. Hence we obtain
that $|\mathit{BIndex}^{-1}(Z)| \le K_{im}^2 |F|$. □



6.5. Car parking. In this subsection, our goal is to spread the indices around the tree such 
that each index ends in exactly one position. This can be seen as parking vehicles. In the 
beginning there are cars (indices) placed in the nodes of the tree, possibly more than one at 
the same position, and we aim at parking each of them in one node, i.e., attaching a single 
node to each of those cars. This is obviously not possible in general but we shall prove that, 
under a sparsity constraint on the distribution of vehicles, it is possible to attach a single 
parking place to each car, and furthermore that the mapping that, given a car, tells where 
to park it, can be described by a WMSO-formula. 

For this purpose, we have to describe distributions and other kinds of mappings that 
involve integers in their domain or image by WMSO-formulas. These integers will always 
be bounded by some constant K and hence we can split the WMSO-definition into several 
formulas, one for each number that may be involved. Formally, we say that a relation
$R \subseteq dom(t)^r \times I$ for some finite $I \subseteq \mathbb{Z}$ is WMSO-definable if there are WMSO-formulas
$\varphi_i(x_1, \ldots, x_r)$ for each $i \in I$ such that $(u_1, \ldots, u_r, i) \in R$ iff $t, u_1, \ldots, u_r \models \varphi_i(x_1, \ldots, x_r)$.
Note that a K-sparse distribution can be seen as a relation of this kind since
$0 \le D(x) \le 3K + 1$ for every $x \in dom(t)$.

Definition 6.13. Given a distribution D, a placement P for D is an injective partial
mapping from $dom(t) \times N$ to dom(t) such that P(x, i) is defined iff $i \in [D(x)]$.

A flow, defined below, can be seen as a kind of instruction on how to spread the values 
of a distribution to obtain a placement. In the vehicles description, this is the number of 
cars which will have to cross an edge in order to reach the final placement. 

Recall that, for simplicity, we assume that all the nodes of a tree t have either
2 successors or no successors, i.e., for all nodes u we have $u0 \in dom(t)$ iff $u1 \in dom(t)$.
This assumption is not essential but simplifies the definitions and allows us to avoid case
distinctions.

Definition 6.14. A flow is a mapping f from the nodes of t to $\mathbb{Z}$. A flow f is compatible
with a distribution D if for all inner nodes x, $D(x) + f(x) \le 1 + f(x0) + f(x1)$ and for
every leaf x, $D(x) + f(x) \le 1$. A flow f is bounded by a constant K if $|f(x)| \le K$ for each
node x.
In this definition, f(x) is interpreted as the number of cars crossing the edge from
the father of x to x. In case of a negative value, $-f(x)$ cars are driving from x to its
father. The condition of f being compatible with D states that after distributing all the
cars according to the flow there is at most one car remaining at each node. One should note
here that, according to our definition, it is possible that $f(\varepsilon) < 0$. With the above intuition
this would mean that one has to send $-f(\varepsilon)$ cars to the (non-existing) father of the root.
We need this case when constructing flows on finite subtrees of a given infinite tree.

In the following we show that for a K-sparse distribution there is a compatible flow that
is bounded. From this flow we then compute a placement for the distribution. We start by
defining a flow on finite trees (which can then also be used to deal with finite subtrees of a
given infinite tree).

Lemma 6.15. For every finite tree t and every WMSO-definable K-sparse distribution D
over t, there exists a WMSO-definable flow f that is compatible with D, bounded by 2K + 1,
and such that for each node x there is a zone $Z_x$ rooted in x of frontier $F_x$ with

(1) $D(Z_x) + f(x) = |Z_x| + K(|F_x| - 1)$,

(2) and $f(y) \ge K$ for all $y \in F_x$ different from x.

Proof. First note that (1) implies $f(x) \ge -K$ for each x since D is K-sparse. We define the
values f(x) and the zones $Z_x$ inductively starting at the leaves. These definitions directly
imply that f is compatible with D.

The base case of a leaf x is straightforward: we set f(x) to be $1 - D(x)$ and $Z_x = \{x\}$.
By the hypothesis of sparsity, we have that $D(x) \le K + 1$ and hence $|f(x)|$ is bounded by
2K + 1.

Let x be an inner node and assume that the values f(x0), f(x1) and the zones $Z_{x0}$, $Z_{x1}$
are already defined. We set $f'(x0) = \min(K, f(x0))$ and $f'(x1) = \min(K, f(x1))$. Let us
now define f(x) to be $1 + f'(x0) + f'(x1) - D(x)$ and $Z_x$ to contain the node x, the nodes
of $Z_{x0}$ if $f(x0) < K$, and the nodes of $Z_{x1}$ if $f(x1) < K$. By the induction hypothesis,
we indeed obtain that $D(Z_x) + f(x) = |Z_x| + K(|F_x| - 1)$. We illustrate this only for the
case $f'(x0) < K$ and $f'(x1) = K$; the other cases are similar. In this case $Z_x = Z_{x0} \cup \{x\}$,
$|F_x| = |F_{x0}| + 1$, and $f(x) = 1 + f(x0) + K - D(x)$. From this we get the following sequence
of equalities:

$$\begin{aligned}
D(Z_x) + f(x) &= D(Z_{x0}) + D(x) + 1 + f(x0) + K - D(x) \\
&= D(Z_{x0}) + f(x0) + 1 + K \\
&= |Z_{x0}| + K(|F_{x0}| - 1) + 1 + K \\
&= |Z_{x0}| + 1 + K|F_{x0}| \\
&= |Z_x| + K(|F_x| - 1).
\end{aligned}$$

From the definition of f(x) it is clear that $f(x) \le 2K + 1$. As mentioned before, $f(x) \ge -K$
and thus $|f(x)|$ is bounded by 2K + 1.

It is obvious that this flow, which has only a bounded number of possible values and
is defined inductively, is WMSO-definable. This definition can be done by requiring the
existence of sets $X_{-K}, \ldots, X_{2K+1}$ such that a node x is in $X_i$ iff f(x) = i. This can be
directly expressed if x is a leaf. Otherwise it is a simple statement on the membership of
the successors x0 and x1 of x. □
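
The inductive definition of the flow in this proof translates directly into a bottom-up computation. The following sketch assumes a finite tree given as a set of words over '0'/'1' in which every node has zero or two successors, and a distribution given as a dictionary; the function names are hypothetical and the zones $Z_x$ are not computed.

```python
def finite_flow(nodes, D, K):
    """Compute f bottom-up: leaves get 1 - D(x), inner nodes use the capped child values."""
    f = {}
    for x in sorted(nodes, key=len, reverse=True):  # children are handled before parents
        if x + "0" in nodes:  # inner node: both successors exist by assumption
            f[x] = 1 + min(K, f[x + "0"]) + min(K, f[x + "1"]) - D[x]
        else:                 # leaf
            f[x] = 1 - D[x]
    return f


def is_compatible(nodes, D, f):
    """Check D(x) + f(x) <= 1 + f(x0) + f(x1), with an empty sum at leaves."""
    for x in nodes:
        below = f[x + "0"] + f[x + "1"] if x + "0" in nodes else 0
        if D[x] + f[x] > 1 + below:
            return False
    return True
```

Compatibility holds by construction because the capped values never exceed the true child values; the bound between -K and 2K + 1 relies on the sparsity of the input distribution.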
Figure 3: Proof of Lemma 6.16. The flow in nodes of type (a) is bounded by 7K.

Lemma 6.16. For every infinite tree t and every WMSO-definable K-sparse distribution D,
there exists a WMSO-definable flow f bounded by 7K that is compatible with D such that
$f(\varepsilon) = 0$.

Proof. As a K-sparse distribution restricted to a finite subtree of t is also K-sparse on this
subtree, we can apply Lemma 6.15 to define the values of f(x) for the nodes x that are not
on infinite branches of t.

Let B be the set of nodes appearing in some infinite branch. We define inductively for
any node $x \in B$ a flow f(x) such that $0 \le f(x) \le 7K$. We only consider non-negative
values since on infinite branches we never reach a leaf and hence there is no need for an
upward flow.

We start by setting /(e) = 0. For x ^ e let y € B be the father of x. Three cases may 
happen. If at node y only one of its children is in B (case (a)), we forward everything to 
this node. Otherwise (cases (b) and (c)), we forward at most 5K to the left child and the 
rest to the right child. The formal definitions are given below, where the the max operator 
is only used to avoid negative flows. 

(a) If x is the only child of y in B, then we set 

f(x) = max(0, f(y) + D(y) - f(x') - 1) 

where x' is the other child of y. 

(b) If the two children of y are in B and x is the left child, then we set 

f(x) = max(0, min(5K, f(y) + D(y) - 1)).

(c) If the two children of y are in B and x is the right child, then we set 

f(x) = max(0, f(y) + D(y) - 1 - 5K). 
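
The three rules (a), (b) and (c) can be read as a small top-down computation. The following Python sketch only illustrates the case distinction; the function names and argument conventions are hypothetical, and the sibling flow in case (a) is assumed to be supplied from outside (in the proof it is provided by Lemma 6.15).

```python
def flow_only_child(f_y: int, d_y: int, f_sibling: int) -> int:
    """Case (a): x is the only child of y in B; f_sibling is f(x')."""
    return max(0, f_y + d_y - f_sibling - 1)


def flow_left_child(f_y: int, d_y: int, k: int) -> int:
    """Case (b): both children of y are in B and x is the left child."""
    return max(0, min(5 * k, f_y + d_y - 1))


def flow_right_child(f_y: int, d_y: int, k: int) -> int:
    """Case (c): both children of y are in B and x is the right child."""
    return max(0, f_y + d_y - 1 - 5 * k)


if __name__ == "__main__":
    k = 2
    f_y, d_y = 7 * k, 3 * k + 1  # the extreme values allowed for f(y) and D(y)
    print(flow_left_child(f_y, d_y, k), flow_right_child(f_y, d_y, k))
```

With K = 2, f(y) = 7K and D(y) = 3K + 1, both children receive 5K = 10, so the f(y) + D(y) - 1 = 20 cars that are not parked at y are split between the two children.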

Obviously, $f(x) \ge 0$ in all cases. We show that if $x$ is of type (b) or (c), then $f(x) \le 5K$,
and if $x$ is of type (a), then $f(x) \le 7K$.

In case (b), $f(x) \le 5K$ follows directly from the definition and in case (c) from $f(y) \le 7K$
(by induction) and $D(y) \le 3K + 1$ ($D$ is $K$-sparse).
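
Spelled out, the estimate for case (c) is the following short computation. It uses only the definition in (c) and the two bounds on $f(y)$ and $D(y)$ just mentioned; nothing further is assumed.

```latex
f(x) \;=\; \max\bigl(0,\; f(y) + D(y) - 1 - 5K\bigr)
     \;\le\; \max\bigl(0,\; 7K + (3K + 1) - 1 - 5K\bigr)
     \;=\; 5K .
```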

For $x$ as in (a) we cannot use a local argument but we have to go upwards until we
reach a node that has a flow of at most $5K$. Such a node must exist because we eventually
meet a node of type (b) or (c), or the root, which has flow 0. All nodes we met before must
be of type (a).

The following definitions are illustrated in Figure 3. The choice of the $y_i$ being the left
successors in the figure is arbitrary and only for matters of presentation. Let $y_n, \dots, y_1$ be
such that $y_n$ is the father of $x$, $y_{i-1}$ is the father of $y_i$ for all $i \in \{2, \dots, n\}$, $f(y_1) \le 5K$,
and $f(y_2), \dots, f(y_n) > 5K$. As mentioned above, $y_2, \dots, y_n$ are of type (a).

Let $x_1, \dots, x_n$ be such that $y_i$ is the father of $x_i$, $x_n \ne x$, and $x_i \ne y_{i+1}$ for
$i \in \{1, \dots, n-1\}$, i.e., $x_i$ is the brother of $y_{i+1}$. For the $x_i$ the flow is defined using Lemma 6.15
because they are not lying on an infinite branch. Hence there are zones $Z_{x_i}$ rooted at $x_i$ of
frontier $F_{x_i}$ such that

$$D(Z_{x_i}) + f(x_i) = |Z_{x_i}| + K(|F_{x_i}| - 1). \tag{6.1}$$

Let $Z = \bigcup_{i=1}^{n} (\{y_i\} \cup Z_{x_i})$ and let $F$ be the frontier of $Z$. Then

$$|Z| = n + \sum_{i=1}^{n} |Z_{x_i}| \tag{6.2}$$

$$|F| = 2 + \sum_{i=1}^{n} (|F_{x_i}| - 1) \tag{6.3}$$

$$\sum_{i=1}^{n} D(y_i) = D(Z) - \sum_{i=1}^{n} D(Z_{x_i}) \tag{6.4}$$

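Since $y_2, \dots, y_n$ are of type (a) and furthermore their flow is bigger than $5K$ (and hence
bigger than 0), we get

$$f(x) = f(y_1) + \sum_{i=1}^{n} \bigl(D(y_i) - 1 - f(x_i)\bigr).$$

We know that $f(y_1) \le 5K$ and hence it remains to be shown that
$\sum_{i=1}^{n} (D(y_i) - 1 - f(x_i)) \le 2K$. This can be deduced as follows:

$$
\begin{aligned}
\sum_{i=1}^{n} \bigl(D(y_i) - 1 - f(x_i)\bigr)
  &= D(Z) - n - \sum_{i=1}^{n} \bigl(D(Z_{x_i}) + f(x_i)\bigr) && \text{by (6.4)} \\
  &= D(Z) - n - \sum_{i=1}^{n} \bigl(|Z_{x_i}| + K(|F_{x_i}| - 1)\bigr) && \text{by (6.1)} \\
  &= D(Z) - n - K(|F| - 2) - \sum_{i=1}^{n} |Z_{x_i}| && \text{by (6.3)} \\
  &\le |Z| + K|F| - n - K(|F| - 2) - \sum_{i=1}^{n} |Z_{x_i}| && \text{($D$ is $K$-sparse)} \\
  &= 2K && \text{by (6.2)}
\end{aligned}
$$

That this flow $f$ is compatible with $D$ and that it is WMSO-definable can easily be deduced
from the definitions. □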

We are now ready to establish our placement lemma.

Lemma 6.17 (car parking). For every tree $t$ and every WMSO-definable $K$-sparse distribution
$D$, there exists a WMSO-definable placement for $D$. If $t$ is finite, we additionally
require that $D(\mathrm{dom}(t)) \le |\mathrm{dom}(t)|$.

Proof. According to Lemmas 6.15 and 6.16 we know that there is a WMSO-definable flow $f$
that is compatible with $D$. For simplicity, we first assume that $f(\varepsilon) = 0$, which is always the
case for infinite trees (Lemma 6.16). If $f(\varepsilon) > 0$ for finite trees, then we can simply redefine
$f(\varepsilon) = 0$ without changing the property of $f$ being compatible with $D$. If $t$ is finite and
$f(\varepsilon) < 0$, then we cannot simply set $f(\varepsilon) = 0$ because this would affect the compatibility of
$f$ with $D$. At the end of the proof we briefly explain how to treat this case.
The general strategy for defining the placement is the following: 

• From each node we send all the cars except one to its neighbors. The number of 
cars sent to each neighbor is described by the flow. 

• If we follow this strategy, then each edge in t is crossed by the cars in only one 
direction. Hence, a car cannot visit the same node twice. This means that it might 
be sent up in the tree for some steps and from some point onwards it is only sent 
downwards. 

• To be sure that each car will be parked after finite time we order all the cars that 
cross a node (as described by D and /) according to a fixed strategy and we also 
fix a scheme for distributing the cars to the neighboring nodes. 

• This ordering will ensure that the index of a car decreases each time it is sent down 
in the tree. As described above, each car is sent up in the tree only a finite number 
of times. Hence, if we always park the car that is first in the ordering at a specific 
node, then each car will eventually be parked. 

To show that this strategy can be realized by WMSO-formulas we first describe the ordering
of cars that we use and then define formulas

• $\mathrm{send}_i(x,y)$ meaning that the $i$th car at node $x$ is sent to node $y$.

• $\mathrm{drive}_{i,j}(x,y)$ meaning that the $i$th car at node $x$ is sent to $y$ and is car number $j$ at $y$.

• $\mathrm{start}_i(x)$ meaning that the $i$th car at node $x$ does not come from another node.

• $\mathrm{itinerary}_i(x, X_1, \dots, X_K, y)$ meaning that the $i$th car at node $x$ will be parked at
node $y$ using an itinerary that is described by the sets $X_1, \dots, X_K$.

To define these formulas we first have to introduce some notation. To avoid case distinctions
we define $f^+(x) = \max(f(x), 0)$ and $f^-(x) = -\min(f(x), 0)$, so that both values are
non-negative. Furthermore, we assume by convention that for a leaf $x$ the values $f^+(x0)$,
$f^+(x1)$, $f^-(x0)$, $f^-(x1)$ are defined, and are all set to 0.

Then the number of cars crossing a node $x$ is $f^+(x) + f^-(x0) + f^-(x1) + D(x)$. Since
all the values involved in this expression are bounded and since we can increase $K$ without
affecting the $K$-sparsity of $D$, we can assume that $f^+(x) + f^-(x0) + f^-(x1) + D(x) \le K$
for all $x$.

Since $f$ is WMSO-definable, we can also assume that there are formulas $\phi^+_i(x)$ and
$\phi^-_i(x)$ defining $f^+$ and $f^-$. Then expressions of the form $i_1 + f^+(x) + f^-(y) = i_2$ for
$i_1, i_2 \in [K]$ can easily be expressed as Boolean combinations of the formulas $\phi^+_i$ and $\phi^-_i$.
The use of expressions of this kind simplifies the presentation of the formulas.
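
To make this remark concrete, here is one way such an expression could be unfolded. It is only a sketch under an assumption that is not spelled out above, namely that $\phi^+_a(x)$ holds iff $f^+(x) = a$ and $\phi^-_b(y)$ holds iff $f^-(y) = b$, where $\phi^+$ and $\phi^-$ denote the formulas defining $f^+$ and $f^-$.

```latex
i_1 + f^+(x) + f^-(y) = i_2
\quad\Longleftrightarrow\quad
\bigvee_{\substack{a,\, b \,\in\, \{0, \dots, K\} \\ i_1 + a + b = i_2}}
  \bigl( \phi^+_a(x) \wedge \phi^-_b(y) \bigr)
```

The disjunction is finite because all values involved are bounded by $K$.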

We start by giving the orderings used in the definitions of the formulas $\mathrm{send}_i$ and
$\mathrm{drive}_{i,j}$. The cars that cross a node $x$ will be distributed in the following order that we refer
to as the distribution order:

$$f^+(x) \;\mid\; f^-(x0) \;\mid\; f^-(x1) \;\mid\; D(x)$$

That is, we first distribute the cars that come from the father of $x$, then the cars that come
from the left child of $x$, and so on. It remains to fix where to send the cars.

$$1 \;\mid\; f^-(x) \;\mid\; f^+(x0) \;\mid\; f^+(x1)$$

This means that the first car in the distribution order is parked in the node $x$. The next
cars are sent to the father of $x$ if $f(x)$ is negative. The following cars in the distribution
order are sent to the left child of $x$ and the remaining cars to the right child of $x$.

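To illustrate this, consider the following example with $x = y0$, $f^+(y) = 3$, $f^-(x) = 5$,
and $f^+(y1) = 7$.

[Diagram: the node $y$ receives 3 cars from its father and 5 cars from its left child $x$, and sends 7 cars to its right child $y1$.]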

Let us first see how the cars at y are ordered according to the distribution order. The first 
three are the ones coming from the father of y. The next 5 are those coming from x. There 
are no cars coming from yl, and the last cars are those from D(y). Now, we would like to 
know what happens to the 4th car at x. The first car at x stays at x. The next 5 cars are 
sent to y, that is, the fourth car at x is the third car sent from x to y. According to the 
distribution order at y described before, this car becomes car number 6 at y. 

This is expressed by the following formulas, where the first two are only defined for
$2 \le i \le K$ because the first car is always kept at the current position.

• For $2 \le i \le K$:

$$
\begin{aligned}
\mathrm{send}_i(x,y) :=\ & [(x = y0 \vee x = y1) \wedge 1 < i \le C_1] \\
\vee\ & [x0 = y \wedge C_1 < i \le C_2] \\
\vee\ & [x1 = y \wedge C_2 < i \le C_3]
\end{aligned}
$$

Here $C_1 = 1 + f^-(x)$, $C_2 = 1 + f^-(x) + f^+(x0)$, and $C_3 = 1 + f^-(x) + f^+(x0) + f^+(x1)$.

• For $2 \le i \le K$:

$$
\begin{aligned}
\mathrm{drive}_{i,j}(x,y) := \mathrm{send}_i(x,y) \wedge \bigl(\ & [x = y0 \wedge j = i - 1 + f^+(y)] \\
\vee\ & [x = y1 \wedge j = i - 1 + f^+(y) + f^-(y0)] \\
\vee\ & [x0 = y \wedge j = i - 1 - f^-(x)] \\
\vee\ & [x1 = y \wedge j = i - 1 - f^-(x) - f^+(x0)] \bigr)
\end{aligned}
$$

• For $1 \le i \le K$:

$$\mathrm{start}_i(x) := f^+(x) + f^-(x0) + f^-(x1) < i \le f^+(x) + f^-(x0) + f^-(x1) + D(x)$$

• For $1 \le i \le K$:

$$
\begin{aligned}
\mathrm{itinerary}_i(x, X_1, \dots, X_K, y) :=\ & \mathrm{disjoint}(X_1, \dots, X_K) \wedge x \in X_i \wedge \mathrm{start}_i(x) \wedge X_1 = \{y\} \\
\wedge\ & \bigwedge_{j=2}^{K} \Bigl( \forall z \in X_j \bigvee_{j' \in [K]} \exists z' \in X_{j'} : \mathrm{drive}_{j,j'}(z, z') \Bigr)
\end{aligned}
$$

This formula states that the $i$th car at $x$ starts there. The free set variables describe
the set of positions that this car crosses, where a position is included in $X_m$ if the
car is the $m$th one at this position. Finally, it states that $y$ is the only position
where the car is first in the ordering. Hence it will stop at $y$.
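
The formulas $\mathrm{send}_i$ and $\mathrm{drive}_{i,j}$ are easy to mis-index, so a small executable sketch may help. The Python code below is an illustration only and not part of the construction. It makes several assumptions: nodes are encoded as strings over 0 and 1 (the root is the empty string), the flow is a hypothetical dictionary `f` in which missing entries are read as 0, $f^-$ is taken to be the non-negative number of cars sent upwards, and the helper `park` (a name introduced here) simply follows `drive` until the car has index 1.

```python
from typing import Dict, Optional, Tuple

Flow = Dict[str, int]  # node (string over "01", root = "") -> flow f(node)


def f_plus(f: Flow, x: str) -> int:
    """Cars sent from the father of x down to x."""
    return max(f.get(x, 0), 0)


def f_minus(f: Flow, x: str) -> int:
    """Cars sent from x up to its father, as a non-negative count."""
    return max(-f.get(x, 0), 0)


def send(f: Flow, i: int, x: str) -> Optional[str]:
    """Node to which the i-th car at x is sent, or None if it is not sent."""
    c1 = 1 + f_minus(f, x)
    c2 = c1 + f_plus(f, x + "0")
    c3 = c2 + f_plus(f, x + "1")
    if 1 < i <= c1:
        return x[:-1] if x else None  # at the root there is no father
    if c1 < i <= c2:
        return x + "0"
    if c2 < i <= c3:
        return x + "1"
    return None  # i = 1 (the car is parked at x) or no such car is sent


def drive(f: Flow, i: int, x: str) -> Optional[Tuple[str, int]]:
    """Pair (y, j): the i-th car at x is sent to y and is car number j there."""
    y = send(f, i, x)
    if y is None:
        return None
    if x == y + "0":  # sent up from a left child
        return y, i - 1 + f_plus(f, y)
    if x == y + "1":  # sent up from a right child
        return y, i - 1 + f_plus(f, y) + f_minus(f, y + "0")
    if y == x + "0":  # sent down to the left child
        return y, i - 1 - f_minus(f, x)
    return y, i - 1 - f_minus(f, x) - f_plus(f, x + "0")  # down to the right


def park(f: Flow, i: int, x: str, max_steps: int = 1000) -> str:
    """Follow drive from the i-th car at x until it is first in the order."""
    for _ in range(max_steps):
        if i == 1:
            return x
        step = drive(f, i, x)
        if step is None:
            raise ValueError("the flow does not route this car")
        x, i = step
    raise RuntimeError("car not parked within max_steps")
```

For the example in the text, encoding $y$ as `"1"` (so $x$ is `"10"`), the call `drive({"1": 3, "10": -5, "11": 7}, 4, "10")` evaluates to `("1", 6)`: the fourth car at $x$ becomes car number 6 at $y$.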
Then the formulas $\psi_i(x,y)$ defining the placement are given by

$$\psi_i(x,y) = \exists X_1, \dots, X_K \bigl(\mathrm{itinerary}_i(x, X_1, \dots, X_K, y)\bigr).$$

As mentioned at the beginning of the proof we now discuss how to treat the case $f(\varepsilon) < 0$.
Recall that this may only happen for finite trees. In general, simply redefining $f(\varepsilon) = 0$
may lead to a flow that is not compatible with $D$ anymore. Therefore, we have to use a
different strategy.

The strategy for distributing the cars described above would lead to $f^-(\varepsilon)$ cars that
"get stuck" at the root because, following the flow, they should be sent upwards, and this
is not possible. This means that there are at most $K$ cars (for simplicity assume exactly
$K$ cars) that start at some node but never arrive at some destination, i.e., there are nodes
$x_1, \dots, x_K$ and $i_1, \dots, i_K \in [K]$ such that

$$\bigwedge_{j=1}^{K} \bigl(\mathrm{start}_{i_j}(x_j) \wedge \neg \exists y\, \psi_{i_j}(x_j, y)\bigr)$$

is satisfied. From the assumption $D(\mathrm{dom}(t)) \le |\mathrm{dom}(t)|$ we can conclude that there remain
at least $K$ nodes where no car is parked, i.e., $K$ nodes which are not in the image of the function
defined by the $\psi_i$'s. Let $y_1, \dots, y_K$ be the $K$ first such nodes for some WMSO-definable
order (this is possible since the tree is finite). We can now extend this function by ordering
the $x_1, \dots, x_K$ and mapping the $i_j$th car from $x_j$ to $y_j$. Note that these definitions are expressible
in WMSO. In this way, we obtain a modification of the function defined by the $\psi_i$'s into a
placement for $D$. □

6.6. Proof of Theorem 4.3. We can now prove Theorem 4.3 as stated in Section 4 by
combining the previous results.
For $X \in \mathrm{Atoms}$ let

$$
\mathrm{Index}(X) =
\begin{cases}
\mathrm{BIndex}(X) & \text{if } I(X) \text{ is an infinite branch,} \\
\mathrm{SIndex}(X) & \text{otherwise.}
\end{cases}
$$

We construct a formula $\phi_{ind}(X,y)$ that associates to each $X \in \mathrm{Atoms}$ its index
$y = \mathrm{Index}(X)$. From the definitions of $I(X)$, $\mathrm{SIndex}(X)$, and $\mathrm{BIndex}(X)$ it is clear that we
can construct such a formula. Note that in the definition of this formula, we do not have
to explicitly represent infinite sets (though $I(X)$ may be infinite) because we can construct
a WMSO formula $\phi_{imp}(X,x)$ that associates to $X \in \mathrm{Atoms}$ its important nodes. From this
one can construct WMSO definitions of $\mathrm{SIndex}(X)$ and $\mathrm{BIndex}(X)$, and hence the formula
$\phi_{ind}(X,y)$ is also WMSO.

Then, we compute the distribution $D$ defined by $D(x) = |\mathrm{Index}^{-1}(x)|$. Since this
distribution is $(K_S + K_B)$-sparse by Lemmas 6.6 and 6.12, it is also WMSO-definable using
the formula $\phi_{ind}$. Let $K$ be a constant such that $D(x) \le K$ for all nodes $x$. Applying
Lemma 6.17 we obtain a WMSO-definable placement $P$ for $D$. One should note that for
finite $t$ the assumption $D(\mathrm{dom}(t)) \le |\mathrm{dom}(t)|$ is satisfied because $D(\mathrm{dom}(t))$ is the number
of elements in the set $E$, and $I(t)$ being isomorphic to $\mathcal{P}_F(E)$ implies that $t$ has at least as
many elements as $E$.

Let $\psi_i(x_1, x_2)$ for $i \in [K]$ be the formulas defining $P$. Now, we order all the
$X \in \mathrm{Atoms}$ with the same index. A possible definition for such an ordering is $X < Y$ if the
lexicographically smallest node that is not in $X \cap Y$ is in $X$. Then one can construct
WMSO-formulas $\theta_i(X)$ stating that $X$ is the $i$th set in the ordering among those that have
the same index as $X$.

The WMSO-formula $\mathrm{Code}(X,x)$ that attaches $X$ to its final position $x$ is then defined
as follows:

$$\mathrm{Code}(X,x) = \bigvee_{i \in [K]} \bigl(\theta_i(X) \wedge \exists y\, (\phi_{ind}(X,y) \wedge \psi_i(y,x))\bigr).$$

□
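
As an illustration of how Code combines its three ingredients, the following Python sketch orders finite sets of nodes, ranks a set among those with the same index, and looks up the final position. It is a sketch under explicit assumptions and not the construction itself: nodes are strings over 0 and 1 compared lexicographically, "not in $X \cap Y$" is read as "in exactly one of $X$ and $Y$", and `index` and `placement` are hypothetical dictionaries standing in for the index formula and the placement formulas.

```python
from functools import cmp_to_key
from typing import Dict, FrozenSet, Iterable, Tuple

NodeSet = FrozenSet[str]


def set_less(x: NodeSet, y: NodeSet) -> bool:
    """X < Y iff the smallest node lying in exactly one of X, Y belongs to X."""
    diff = x ^ y
    return bool(diff) and min(diff) in x


def _compare(a: NodeSet, b: NodeSet) -> int:
    """Three-way comparison derived from set_less, for use with sort."""
    if set_less(a, b):
        return -1
    if set_less(b, a):
        return 1
    return 0


def rank(x: NodeSet, atoms: Iterable[NodeSet], index: Dict[NodeSet, str]) -> int:
    """1-based position of X among the atoms that have the same index as X."""
    same = sorted((a for a in atoms if index[a] == index[x]),
                  key=cmp_to_key(_compare))
    return same.index(x) + 1


def code(x: NodeSet, atoms: Iterable[NodeSet], index: Dict[NodeSet, str],
         placement: Dict[Tuple[str, int], str]) -> str:
    """Final position of X: where car number rank(X) of node Index(X) is parked."""
    return placement[(index[x], rank(x, atoms, index))]
```

Note that under this ordering a set that contains the first differing node is the smaller one, so for instance {0, 1} comes before {0}.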

One should note here that without the results from Subsection 6.5 we can obtain a
weaker version of Corollary 4.4 by replacing the interpretation $I_2$ by a transduction, i.e.,
an interpretation that can use a fixed number $K$ of copies of the given structure. Such
a transduction can be realized using the formula $\phi_{ind}$ instead of Code. In particular one
could use this weaker version of the result in all the applications, but at the price of some
notational and technical overhead.

From this point of view, Lemma 6.17 can also be seen as a result on the question
under which conditions a (W)MSO-transduction is equivalent to a (W)MSO-interpretation
on binary trees: this is the case if the distribution defined by the transduction, i.e., the function
assigning to each node the number of times it is used in the result of the transduction, is
$K$-sparse (for some $K$) on each tree. This presentation can be used for both
WMSO-transductions and MSO-transductions.

Acknowledgments. We are particularly grateful to Vince Bárány for his contribution to
the random-graph proof, as well as his comments on previous versions of this work. We also
thank Achim Blumensath for commenting on earlier stages of this work, and the anonymous
referees for their helpful comments.

References 

[Bar06] Vince Bárány. Invariants of automatic presentations and semi-synchronous transductions. In
STACS 2006, volume 3884 of LNCS, pages 289-300. Springer, 2006.

[BG00] Achim Blumensath and Erich Grädel. Automatic Structures. In Proceedings of 15th IEEE
Symposium on Logic in Computer Science LICS 2000, pages 51-62, 2000.

[BG04] Achim Blumensath and Erich Grädel. Finite presentations of infinite structures: Automata and
interpretations. Theory of Computing Systems, 37:641-674, 2004.

[Blu99] Achim Blumensath. Automatic structures. Diploma thesis, RWTH Aachen, 1999.

[Blu01] Achim Blumensath. Prefix-recognisable graphs and monadic second-order logic. Technical
Report AIB-06-2001, RWTH Aachen, May 2001.

[Cau96] Didier Caucal. On infinite transition graphs having a decidable monadic theory. In ICALP'96,
volume 1099 of LNCS, pages 194-205. Springer, 1996.

[Cau02] Didier Caucal. On infinite terms having a decidable monadic theory. In MFCS'02, volume 2420
of LNCS, pages 165-176. Springer, 2002.

[CO06] Bruno Courcelle and Sang-il Oum. Vertex-minors, monadic second-order logic and a conjecture
by Seese. Journal of Combinatorial Theory, Series B, 2006. To appear.

[Col02] Thomas Colcombet. On families of graphs having a decidable first order theory with reachability.
In Proceedings of ICALP 2002, volume 2380 of LNCS, pages 98-109. Springer, 2002.

[Col04] Thomas Colcombet. Equational presentations of tree-automatic structures. In Workshop on
Automata, Structures and Logic, Auckland, New Zealand, December 2004.

[Cou89] Bruno Courcelle. The monadic second order logic of graphs II: Infinite graphs of bounded width.
Mathematical System Theory, 21:187-222, 1989.






[Cou97] Bruno Courcelle. The expression of graph properties and graph transformations in monadic
second-order logic. In G. Rozenberg, editor, Handbook of graph grammars and computing by
graph transformations, Vol. 1: Foundations, chapter 5, pages 313-400. World Scientific, 1997.

[Cou04] Bruno Courcelle. Clique-width of countable graphs: a compactness property. Discrete
Mathematics, 276(1-3):127-148, 2004.

[CT02] Olivier Carton and Wolfgang Thomas. The monadic theory of morphic infinite words and
generalizations. Information and Computation, 176(1):51-65, 2002.

[CW03] Arnaud Carayol and Stefan Wöhrle. The Caucal hierarchy of infinite graphs in terms of logic
and higher-order pushdown automata. In FSTTCS'03, volume 2914 of LNCS, pages 112-123.
Springer, 2003.

[DT90] Max Dauchet and Sophie Tison. The theory of ground rewrite systems is decidable. In LICS'90,
pages 242-248. IEEE, 1990.

[ECH+92] David B. A. Epstein, James W. Cannon, Derek F. Holt, Silvio V. F. Levy, Michael S. Paterson,
and William P. Thurston. Word processing in groups. Jones and Bartlett Publishers, 1992.

[ER66] Calvin C. Elgot and Michael O. Rabin. Decidability and undecidability of extensions of second
(first) order theory of (generalized) successor. Journal of Symbolic Logic, 31(2):169-181, 1966.

[Hod83] Bernard R. Hodgson. Décidabilité par automate fini. Ann. Sci. Math. Québec, 7(3):39-57, 1983.

[Hod93] Wilfrid Hodges. Model Theory. Cambridge University Press, 1993.

[KL02] Dietrich Kuske and Markus Lohrey. On the theory of one-step rewriting in trace monoids. In
ICALP'02, volume 2380 of LNCS, pages 752-763. Springer, 2002.

[KN95] Bakhadyr Khoussainov and Anil Nerode. Automatic presentations of structures. In Workshop
LCC '94, volume 960 of LNCS, pages 367-392. Springer, 1995.

[KNRS04] Bakhadyr Khoussainov, André Nies, Sasha Rubin, and Frank Stephan. Automatic structures:
Richness and limitations. In LICS'04, pages 44-53. IEEE Computer Society, 2004.

[KNUW05] Teodor Knapik, Damian Niwiński, Paweł Urzyczyn, and Igor Walukiewicz. Unsafe grammars,
panic automata, and decidability. In ICALP'05, volume 3580 of LNCS, pages 1450-1461.
Springer, 2005.

[KRS04] Bakhadyr Khoussainov, Sasha Rubin, and Frank Stephan. Definability and regularity in
automatic structures. In STACS'04, volume 2996 of LNCS, pages 440-451. Springer, 2004.

[Mad03] P. Madhusudan. Model-checking trace event structures. In LICS'03, pages 371-380. IEEE
Computer Society, 2003.

[MR99] Johann A. Makowsky and Udi Rotics. On the clique-width of graphs with few P4's. International
Journal of Foundations of Computer Science, 10(3):329-348, 1999.

[MS85] David E. Muller and Paul E. Schupp. The theory of ends, pushdown automata, and second-order
logic. Theoretical Computer Science, 37:51-75, 1985.

[Rub04] Sasha Rubin. Automatic Structures. PhD thesis, University of Auckland, New Zealand, 2004.

[See91] Detlef Seese. The structure of models of decidable monadic theories of graphs. Annals of Pure
and Applied Logic, 53:169-195, 1991.

[Tho97] Wolfgang Thomas. Languages, automata, and logic. In G. Rozenberg and A. Salomaa, editors,
Handbook of Formal Language Theory, volume III, pages 389-455. Springer, 1997.

[Tho03] Wolfgang Thomas. Constructing infinite graphs with a decidable MSO-theory. In Proceedings of
MFCS 2003, volume 2747 of LNCS, pages 113-124. Springer, 2003.



This work is licensed under the Creative Commons Attribution-NoDerivs License. To view 
a copy of this license, visit http://creativecommons.org/licenses/by-nd/2.0/ or send a
letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA. 



